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1. STATEMENT OF THE PROBLEM 


The usual performance criteria for control systems are expressed in terms of 
undamped natural frequency, relative damping factor, gain and phase margin, steady- 
state error, etc. At a higher level of sophistication, one is often faced with the need 
to optimize the control system design from the point of view of minimum fuel expen- 
diture, minimum time, maximum payload, or similar requirements. Optimal use of 
available resources is an implicit goal of every engineering design. The perennial 
demands for increased performance in aerospace control systems have not only re- 
vitalized the classical mathematics that provides the medium and foundation for opti- 
mality, but have also stimulated further research to improve available methods. 

Certain problems in this area are well defined and can be expressed in a mathe- 
matical format well suited to the application of the classical variational calculus. 
Among these are the classical problems of programming rocket thrust to achieve max- 
imum altitude( 20 ) and attain minimum fuel transfer between circular orbits. ( 21 ) But 
a multitude of related optimality criteria for launch or orbital trajectories, such as 
maximum range, minimum fuel, and minimum time, lead to progressively more diffi- 
cult analytical formulations and strain the capabilities of the classical variational 
methods. References 20 through 40 are only a representative list of studies in optimal 
trajectories. 

While optimization criteria can generally be clearly defined for space trajectories, 
they are not so clearly delineated for control systems dealing with "short-period" dy- 
namics. Here stability and constraints on the state variables are generally the para- 
mount considerations. It is often difficult to relate natural mathematical optimality 
criteria with meaningful physical parameters. Thus, for example, it is often expedi- 
ent to use some quadratic function of state or control variables as a mathematical 
measure of performance. However, it is not always possible to justify this choice by 
purely physical means. Indeed, any linear feedback system is optimal in the sense that 
it minimizes the integral of a quadratic function of state and control variables whose 
weig hting factors are appropriately chosen. Nevertheless, recent studies have indi- 
cated that meaningful criteria can be established in terms of specified closed-loop 
response (14) or model-following. (1®) 

The basic theories for control system and trajectory optimization are identical; 
the difference is in fundamental time constants, which will generally differ by several 
orders of magnitude. It is therefore unnecessary to distinguish between the two in a 
basic exposition. In defining the scope of this monograph, we are guided first by the 
general theme of this series, which is the presentation of established and proven 
methods that have been shown to be useful in the design of aerospace control systems. 
Second, it is intended to reflect the current state of technical development. Subject to 
the limitations of space and mathematical sophistication expected of the reader. 
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fundamental results will therefore be presented concisely and (hopefully) clearly, with 
a minimum of analytical abstractions. In addition to pedagogic examples, applications 
to realistic aerospace control systems will be discussed. Limitations and subtleties 
in the basic theory will be emphasized, especially as they affect basic design. 

The discussion of the general analytical tools of optimal control theory (variational 
calculus, gradient methods, maximum principle, dynamic programming) will be sup- 
plemented by examples of application and by recent extensions and generalizations. 

The monograph is intended to summarize the present status of optimal control theory 
for aerospace control systems and to facilitate the study of the numerous problems in 
this area, as delineated in the extensive list of references. 
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2. STATE OF THE ART 


An intelligent discussion of optimal control systems depends on defining an explicit 
measure of performance quality. Often this can be done only qualitatively. For exam- 
ple, a launch vehicle autopilot or satellite attitude control system may be required 
merely to exhibit "good" dynamic response characteristics, such as frequency, rela- 
tive damping, and low sensitivity to extraneous disturbances. This is essentially a 
problem in conventional design. If further demands are made on the system — such as 
performing a stipulated mission in minimum time, or with minimum fuel or energy 
expenditure — a more sophisticated analysis is required. 

There are essentially two aspects to the problem of optimal control. One is the 
proper formulation of the problem (which is not always trivial); the other is the develop- 
ment or application of an appropriate tool to solve the problem. 

The first is more an art than a science, since the formulation of a mathematical 
model for a real system must be a compromise between the needs for analytical tracta- 
bility and for including all significant dynamic features. The results of an analysis 
must therefore be interpreted with due regard for these factors. 

This monograph is concerned primarily with the analytic tools for solving optimiza- 
tion problems . The natural vehicle for this exposition is the classical calculus of vari- 
ations. Broadly speaking, the problem is formulated as follows. Assume a system 
whose dynamic properties are described by a set of ordinary differential equations. 

Some type of initial and terminal conditions are prescribed. One or more degrees of 
freedom are available in the form of a control function that must be programmed such 
that a stipulated function of the control and state variables is minimized (or maximized). 

In this fashion, we have the classical problem of Mayert, the solution of which is 
well established. Some qualifications are in order, however. First, this "solution" 
is often in the form of a system of nonlinear differential equations with two-point bound- 
ary conditions. The computational difficulties associated with this are far from trivial 
and often overwhelming. Second, the classical formulation includes no constraints on 
the state or control variables. This severely limits its usefulness, since, in the real 
world, bounds on energy and magnitude of a control force are hard realities. In some 
instances, a state variable must be contained within prescribed bounds — such as 
limiting an angle of attack to ensure structural integrity. 

As a result of these limitations in the classical techniques, several new approaches 
to the problem have been initiated in the last 10 years. These have led to the develop- 
ment of dynamic programming, the gradient method, the maxi m um principle, and the 

t Also called the problem of Lagrange or Bolza, when expressed in slightly different 
fashion. 
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generalized Newton-Raphson technique. With this new body of theory, the range of 
problems that can be resolved has been considerably enlarged. But as usual, few 
panaceas are available. 

Dynamic programming, which exhibits the greatest indifference to the presence of 
"pathological" functions, is limited by computer storage capacity. 

The gradient method, one of the more powerful techniques, generally yields only 
local extrema, primarily because small deviations from a reference condition are 
analyzed. 

The maximum principle, in which is included the effect of control constraints, is 
a more elegant formulation of the classical Weierstrauss condition in variational calcu- 
lus. It is, however, completely identical to the classical methods when the inequality 
constraints are included via the Valentine condition. 

The generalized Newton-Raphson technique is a powerful new tool for solving non- 
linear systems with two-point boundary conditions. It has been remarkably successful 
in solving problems that strained the capability of conventional methods. This, of 
course, has served to invigorate the classical approach, which is attractive because of 
its analytical elegance. The weakness of the method is that only sufficient conditions 
for convergence are known. Even these are hard to apply. It is known, however, 
that the method works in cases where the sufficiency condition is not satisfied. (In 
other words, this sufficient condition may not be necessary. ) Consequently, a certain 
amount of "eut-and-try" is necessary in practice. 

For any given problem, therefore, one may select one of a multitude of super- 
ficially unrelated methods to achieve optimization. Generally, the form of the problem 
or the type of results desired will indicate the method to be used. A fuller understand- 
ing of the virtues and limitations of each method requires recognition that all these 
methods are implicitly (if not explicitly) related. We have therefore not derived any 
of these results in the conventional m ann er but have shown (Sec. 3.1.5) that each is a 
consequence of one general formulation. 

The examples of application of the theory are an attempt to balance simplicity of 
exposition with practical realism. These examples, together with a sampling of the 
literature cited in the references, provide a fair indication of the current state of the 
art. 
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3. RECOMMENDED PROCEDURES 


The basic problem of optimal control can be stated generally as follows. Assume 
a dynamic system described by some appropriate set of differential equations . A 
parameter or set of parameters (controls) is to be programmed in such a way that a 
prescribed measure of performance takes on an extremal (maximum or minimum) 
value. Usually, there are prescribed values for the initial and/or final state of the 
system. Furthermore, there may be constraints, dictated by physical considerations, 
on the control or state variables. Taking account of all these factors, we seek to 
determine the control function form that ensures that the measure of performance is 
maximized, or minimized, as the case may be. 

There are many aspects to this problem depending on the complexity of the system, 
the form of the constraints, and the type of performance criterion adopted. Further- 
more a solution may be characterized as: open-loop , wherein the control is obtained 
as a function of time and therefore of the initial conditions; or closed-loop , in which 
case the control is a function of the current state of the system. 

All these considerations directly affect the mathematical complexity and the ease 
with which a solution may be obtained. This section presents the essential features of 
standard optimization techniques, beginning with the simplest problems expressed in 
the classical variational format. This is followed by a discussion of the classical 
theory limitations that motivated some of the modern techniques, such as the maxi- 
mum principle, gradient methods, and dynamic programming. 

Typical applications and standard solutions are then described. The concluding 
sections deal with recent theoretical developments , together with typical aerospace 
applications. 

3.1 MATHEMATICAL CONCEPTS 

The mathematical features of the basic optimization techniques are described in 
Sections 3. 1. 1 through 3.1.4. In general, the treatment is concise but explicit; the 
primary aim is to facilitate application of the theory to practical engineering problems. 
Elaborate motivations and mathematical abstractions are therefore avoided. Illustra- 
tive examples are used liberally to enhance comprehension of the basic ideas. 

3.1.1 Variational Calculus 

For the solution of optimization problems of a more general character than is 
possible via the elementary calculus , the classical tool is the calculus of variations . 
Most problems of interest to the aerospace controls engineer can be formulated in 
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terms of the so-called Mayer problem^ ^^), which includes a variety of related pro- 
blems as special cases. It is therefore of extreme generality and will accordingly be 
described in some detail. 

The Mayer problem is usually stated as follows: there are n functions, x k (t), of 
an independent variable, t, that satisfy the differential constraints 

<p. (x, x, t) = 0 j = 1, 2, , P (<n) (1) 

x(t) = n vector 

subject to boundary conditions of the type 

0 r ( x i’ x f’ V *f) = ° r = 1, 2, • • • • , q ( < 2n + 2) (2) 

where tj and tf are the initial and final times respectively, andt 


x. = x (ti) 
x f = x (t f ) 


<•) = 


d_ 

dt 


( ) 


It is required to choose the functions , x k (t) , such that the quantity 
If 


J = 


G (x, t) 


= G(x, t)| t=t -G(x, t)| t=t (3) 

f i 

is minimized subject to the constraints (1) and (2). 


iThe subscripts i and f are used exclusively to denote initial and final value respec- 
tively, not the components of a vector. Symbols other than i and f will be used to 
denote vector components. 
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The problem thus formulated is more general than is superficially apparent. For 
example, in the problem of Lagrange, the function to be minimized is 


t f 

J = f L (x, x, t) dt 


( 4 ) 


If we define an auxiliary variable by 
* n+1 (t) = / L (x, x, t) dt 


( 5 ) 


with 


Vi (t i> = 0 


then we reduce to a Mayer-type problem such that 
J = x n+l (t f> 

with the additional constraint 


<Vl s i n + 1 (t) ’ L(i ’ X ’t) = 0 


In the problem of Boiza , the function to be minimized is 


J = |g (x, t) + £ L (x, x, t) 


dt 


( 6 ) 


( 7 ) 


( 8 ) 


( 9 ) 


We again reduce this to a Mayer -type problem by defining an auxiliary variable 
in the manner shown above . 

Inequality Constraints 

If there exist inequality constraints of the formT 


K l< < 


K 


( 10 ) 


+The discussion also applies if the inequality constraint is on instead of 
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then by introduction of a new variable, x n+ ^(t), defined by^®®) 

(V *i)fc-*k)- Vi ' 0 (11 > 

the problem is put in the Mayer format, where J, defined by Eq. (3), is minimized 
subject to the added differential constraint 

«Vl - (*k - K l) ( K 2 - \) - Vl “ 0 <“> 

The Mayer format is therefore extremely general and finds wide application in 
aerospace controls problems. A solution is obtained in the following way. Form the 
augmented function 


-Ey, 

j=l J 1 


(13) 


where the X . are time -dependent functions called the Lagrange multipliers. A necessary 
condition for the minimization of the criterion function defined by Eq. (3) is that the 
Euler Lagrange equations 

d / d F \ 3F 


dt 

be satisfied. 


&)- 


9X k 


= 0 


k = 1, 2, 


, n 


(14) 


The system of equations (1) and (14) is subject to (2n + 2) boundary conditions. Of 
these, q are supplied by Eq. (2), while the remaining ones are determined from the 
transversality condition ^ 134 ) 


6 G +|F 


-if 

V i\ 6 t ♦ V ££ 6 x =0 
ti d \ 7 h ^ . 


(15) 


where 6 ( ) is the variation operator . In general , 




(16) 
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which means that Eq. (15) can be expressed as 



This expression splits up into (2n + 2) subconditions of the form 



k = 1,2, • • • • , n 



k = 1, 2, , n 

If the boundary condition for a particular variable, say xj^, is not prescribed, then 
6 is arbitrary, which means that 



Notice that boundary values cannot be assigned to nonderivated variables. These 
are, in fact, a mathematical consequence of the constraining equations (1) and the 
Euler Lagrange equations (14). 

If the augmented function, F, does not contain t explicitly, then it can be shown 
thatt 


tRef. 134, p. 205. 
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C = constant 


( 23 ) 



This means that Eq. (17) simplifies to 


-.f 



and the set of equations (18) through (21) is simplified accordingly. 

The solution of certain variational problems is characterized by the fact that one 
or more of the derivatives x^ are discontinuous. For example, in a rocket vehicle, 
the acceleration is discontinuous at the point where thrust is terminated or staging 
occurs. In this case, a mathematical criterion is needed to join the portions of the 
extremal arct at points of discontinuity. This is supplied by the Erdmann -Weierstr ass 
corner conditions^ 13 " 1 ) 



where the negative and positive signs denote conditions immediately before and after a 
point of discontinuity. 

For the special case in which F does not depend explicitly on t, the last relation 
reduces to 

(C)_ = (C) + (27) 

by virtue of (23). 3h other words, the integration constant, C, has the same value for 
all subarcs composing the extremal arc. 

Satisfying the Euler Lagrange equations, (14), merely ensures that function J, 
defined by Eq. (3), is either a maximum or a minimum. More precisely, it is a 
local extremum; in other words, it may be only one of many extremal functions. This 

tThe extremal arc is that curve in n space characterized by x^ (t), • • • • , x n (t), which 
extremizes J. 
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corresponds roughly to the case o' a curve In elementary calculus whose derivative 
vanishes at several points. The fact that the slope is zero gives no information as to 
whether the point on the curve is a maximum or minimum or whether it is indeed the 
global maximum or minimum. In the variational calculus, the necessary and/or 
sufficient conditions for type of extrema are not easily applied. Perhaps the most 
useful for engineering purposes is the Legendre-Clebsch condition , which states 
that the relation 


n n 

ZI 

k=l j=l 


2 

a f 


ix. > 0 
3 


(28) 


must be satisfied if J is a minimum. If the derivative, x r , of function Xj. does not 
appear explicitly in F, then in (28), we replace x r with Xj. • 


The above condition is merely necessary (not sufficient) to ensure that J is a 
minimum. Furthermore, only a local minimum is ensured. Fortunately, inmost 
cases of engineering interest, the type of extremal arc obtained is determined from 
physical or numerical considerations, and only rarely does an ambiguity exist between 
local and global extrema. 


The methods discussed above are illustrated in the following examples. 

Example 1, The Sounding Rocke t: This is one of the earliest aerospace problems 
treated by variational methods. ( 5 » 20 » 150 ) in the usual formulation, we seek the 
thrust-time history such that a maximum height is attained for a given propellant 
weight. Alternatively, we may seek the minimum propellant weight to attain a spec- 
fied altitude . The relevant differential equations are 


T - D = m (V + g) 

(29) 

h = V 

(30) 

T = PH 

(31) 

j8 = - m 

(32) 

T h thrust 
D s D (V, h) h drag 
m = instantaneous mass 
h = altitude 
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V = velocity 

g = gravity acceleration 

fx s exit velocity of burned propellant 

(} - propellant mass flow 

Physical considerations require that the mass flow satisfy the inequality 

0 < fi < P M (33) 

where S w is a given constant. 

M 

The problem is easily formulated as a Mayer type as follows. The differential 


constraints are 

p s h - V - 0 (34) 

<P 2 = v + g + = o (35) 

<p 3 h m + 0 = 0 (36) 

«P 4 S -P)- 0 ? = 0 < 37 > 


The last relation, which is in the form of Eq. (12), accounts for the inequality 
constraint (33). No special physical significance is attached to the variable, a . 

The problem variables are identified as follows. 
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The augmented function is therefore 


F - Aj (h - V) +X 2 (v + g + ~ +X 3 (* + g) 

< 38 > 

and the Euler Lagrange equations become 


X 2 3D 
m 3h 

(39) 

- -X. / 2sD 

1 m 3 v 

(40) 

Om-d)^ 

2 

(41) 

m 



(42) 

: 

(43) 


Since F does not contain t explicitly, we have, from Eq. (23), combined with the 
Euler Lagrange Eqs., (42) and (43), 

• C (44 » 

where C is an integration constant. 

It now remains to formulate the criterion function, J, and the boundary conditions, 
^) r . If we seek to maximize the altitude attained, subject to the boundary conditions 


h. 

l 

= 0 

t. = 0 
1 



V i 

= 0 

II 

o 


(45) 

m. 

l 

= m o 

s* 

II 

d 
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then J becomes + 


J 



( 46 ) 


Since tj and hf are not prescribed, the transversality condition, (17), yields 



The first of these relations implies that 
C = 0 

since G (= -h) is independent of t and 6 tf is arbitrary. Eq. (48) leads to* 


(47) 


(48) 


(49) 


(50) 


According to Eq. (43), an extremal arc is composed of subarcs along which «= 0 
or X 4 = 0. The first of these implies that 0 or f$ = by virtue of (37). If we 
write 


K 


= x 9 -x„ 

m2 3 


(51) 


then the condition \ 4 = 0 and Eq. (42) leads to 

K = 0 and K = 0 (52) 

Taking the time derivative of (51) and combining with the Euler Lagrange equations, 
we obtain 

m (D + m g) K = 0 ^ |^+D^K+X 1 [(V-^)D 

+ ( v fr~ mg ) w ] (53) 

t Obviously, the problem of minimizing (-h{) is equivalent to that of maximizing hj. 

* The notation - stands for \ ^ (t^). 
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after making use of Eq. (44). 


Consequently, along a variable thrust subarc, the relation 
(V- u )D + (v|^ -mg) y = 0 


(54) 


must be satisfied. This therefore determines the variable thrust parameter, g= -m. 

In order to ascertain the physical significance of this equation, consider a drag function 
given by 


D 


= c x V , 


= positive constant 


(55) 


Combining this with Eq. (54), we find 



This shows that a decrease in mass is accompanied by a decrease in velocity; in 
other words, the acceleration is negative. Consequently, the variable thrust arc is 
such that the vehicle is flown with thrust always less than drag during this phase. 


It now remains to determine how the subarcs for zero, maximum, and variable 
thrust are connected such that the optimality conditions are satisfied and in a manner 
that is consistent with the given boundary conditions. To do this, we make use of the 
Weierstrass Erdmann corner conditions and the Legendre Clebsch conditions. Eqs. 
(25) and (26) show that at each corner point 


Ui)_ = (Xj) + 

<V- = <v + 

(C)_ = (C) + 


(56) 


That is, the above parameters are continuous at the corner points of the extremal arc. 

Now since C = 0 and since V and m are also continuous, Eq. (44) shows that g 
may be discontinuous provided that 


(K)_ = (K) + = 0 


(57) 
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which implies that 


(K)_ = (K) + = 0 (58) 

Furthermore, the Legendre Clebsch condition (28) leads to 

-2 X 4 [(6 a ) 2 + (6 0 ) 2 j > 0 

Therefore, when X ^ ^ 0, we have by virture of (42) 

2 K f 2 2l 

* < 8 S) P° 

or, equivalently, 


0m- 2 0 > ° 


It was noted earlier that X 4 4 0 implies that p = 0 or P = /3 M - Consequently, the 
above inequality shows that 


P = fa when K > 0 

p = 0 when K < 0 (60) 


0 < P < when K = 0 

The last relation follows from the fact that for X 4 = 0 , a 4- 0 which means that p 

may assume an intermediate value as shown by Eq. (37). The quantity p (= -m) is in 

fact determined from Eq. (54). 

It now appears that the solution is completely determinate. The six equations, 
(34) through (36) and (39) through (41) together with the six boundary conditions 

h = 0 *if = 1 fro™ E< 3* ( 50 ) 

V. = 0 V. = 0 

l f 

m = 0 m„ = m 

i f P 
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may be integrated to yield m, V, and h as well as the Lagrange multipliers. Quantity 
K, defined by Eq. (51), is calculated in the course of the solution and determines the 
thrust switching points in accordance with Eq. (60). 


It is tempting to consider the problem as now solved. However, from a practical 
point of view, several factors disturb this state of euphoria. First of all, the difficulties 
associated with solving a system of nonlinear differential equations subject to two- 
point boundary conditions are too well known to be belabored here. In the present case, 
one must guess the initial values of X-p X 2 , and X3 in Eqs. (39) through (41) and hope 
that the final boundary conditions are satisfied. Various iterative procedures are 
available for adjusting the initial guesses to ensure that one converges to the stipulated 
final values. This is not always possible, since poor guesses lead to numerical insta- 
bilities, with the result that the engineer is often led to seeking other ways of making a 
living. Nevertheless, the practical importance of this problem has stimulated extensive 
research for improved methods. Some promising approaches are discussed in Appen- 
dix A. 

In the present problem, it is not necessary to solve the equations of motion in 
order to determine the general features of the optimal thrust program. It does require, 
however, that certain simplifications be introduced. Specifically, we assume that the 
drag is given by Eq. (55). Then, since D is independent of h, Eq. (39) reduces to 
Xj = 0, which means that X^ = 1 by virtue of Eq. (50). As a result, Eqs. (53) and (54) 
may be written as 

m (cf V 2 + m gj K = c x (3 V (2/1 + V) K + Q (61) 

Q = c 1 V 2 |^+lj-mg = 0 (62) 

We may plot the variable thrust condition, (62), in the m-V plane as shown in 
Fig. 1. The initial and final points on the trajectory are here designated by (l) and (fj^ 
respectively. It is obvious that an extremal arc begins with maximum thrust from 
point (1) (corresponding to the initial conditions mj = mg, and Vj = 0). The rest of the 
figure shows various possibilities of using maximum thrust, coasting, and variable 
thrust to arrive at the terminal point, @ , which is obviously approached via a coast- 
ing arc (constant mass). We will show that assuming the quadratic drag law given by 
Eq. (55) leads to a unique optimal thrust program consisting of the three subarcs shown 
in Fig. 2. Consider first the possible path (l) - (2) - (3) - (5) shown in Fig. 1. Along 
the maximum thrust arc, (l) - (2), we must have K > 0, while along the coasting arc, 

(2) - (3), K < 0 as required by the optimal switching conditions, (60). 
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Figure 1. Extremal Thrust Arcs in m-V Plane 
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But since Eq. (57) shows that K is continuous at a switching point, it follows that 


at point (2): K = 0, K < 0 

at point (3>. K = 0, K > 0 

In this case, Eq. (61) indicates that Q < 0 at point (2). and Q > 0 at point (3). Ex- 
pressed in terms of Eq. (62) 

°1 V 2 + ^ < m 2 g at point © 

°1 V 3 (‘/f’ + > m 2 g at P ° int ® 

According to Fig. 1, V 3 < V 2 , so that the above two conditions are incompatible. 

By similar arguments, we eliminate the subarcs (?) - ( 6 ) - ( 8 ) and (9) - (T3) - (0) . 

The optimal path must therefore be as shown in Fig. 2, composed of a maximum, fol- 
lowed by variable thrust, and then a coasting arc. 

If the final mass is sufficiently large, then there will be only a path of maximum 
thrust followed by coasting, as shown by path (T) - ( 2 ) - @ in Fig. 2. 

It may be shown that the solution is not completely determinate in all cases. Con- 
sider, for example, the situation shown in Fig. 3. The peak and trough in the variable 
thrust curve, Eq. (54), may be due to the peculiarities of the drag function near Mach 1. 
A variable thrust along (T) - (?) is not admissible, since this corresponds to increasing 
mass . Consequently, there must be some intermediate coasting path, (3) - (4)! How- 
ever, the precise location of this path cannot be determined from the present theory. 

We have analyzed this example in some detail, since it highlights both the virtues 
and the limitations of the classical variational calculus . The theory is well adapted for 
solving relatively simple (textbook) problems but suffers from severe limitations of a 
computational nature when applied to moderately realistic practical problems . This 
motivated the development of the so-called direct methods (gradients, dynamic pro- 
gramming), which generally exhibit marked computational advantages over the var- 
iational approach. Before turning to these, we will first discuss the Pontryagin maxi- 
mum principle, in which the classical variational calculus is highly systematized, and 
which is therefore much simpler to apply in practice. 

3.1.2 Maximum Principle 


A wide class of optimum control problems can be formulated as follows. Given 
the system 

x = f (x, u, t) (63) 
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where X is an n-dimensional state vector and u is an m-dimensional control vector. 

The components of the latter are subject to constraints of the form 

i/. < u. < fi. (64) 

j = 1, 2, ••••, m 

where the i/j and n . are known constants . It is required to determine u (t) in the in- 
terval , tj < t <tj, ^such that the criterion function 

J = c T x f (65) 


is a minimum (maximum). Here c is a constant n vector that is known, and the notation 
Xf means x (tf). In other words, we seek to minimize (maximize) a prescribed linear 
combination of the final value of the state vector components . 

As noted in Sec. 3. 1. 1, many types of criterion function can be reduced to form 
(65). Thus, if it is required to minimize (maximize) the integral 

E 

L (x, u, t) dt (66) 

i 



we define a new state variable by 


/ 


x n+1 (t) = / L (x, u, t) dt 


with 


Vi <v = 0 


and add the equation 


x q+i = L (x, u, t) 


(67) 


( 68 ) 


(69) 


to system (63). With these modifications, the problem is reduced to that of minimizing 
(maximizing) the quantity 

J = X q+1 (t f ) (70) 
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Furthermore, the problem of minimizing (maximizing) some function of the final 
state vector, ( (x^) , reduces to the same format if we define an additional state 
variable 


Vi = c « 


(71) 


witht 


x 


n+1 


(t.) = C (Xj) 


and add to the system, (63), the equation 


(72) 


n+1 


n 

E 

j=l 


dx. 

3 


(73) 


The new problem is then reduced to that of minimizing (maximizing) the quantity 


J 


= x 


n+1 


(t f ) 


(74) 


In order to avoid any possible ambiguity due to the notation adopted, we emphasize 
that the subscripts i and f are used exclusively to denote initial and final values re- 
spectively. They are not to be interpreted as components of a vector . Vector com- 
ponents will be designated by subscripts j, k, etc. Thus 


x. 

i 


X 1 <v 

x 2 <V 


<v 


XU 

Xj = j n component of the vector x 
X jf S x j (t f>* etc * 


One frequently encounters the so-called "minimum-time" problem, in which it is 
required to transfer the system from a given initial to a given final state in minimu m 


t The symbol x- means x (tj) . 
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time. Mathematically, this requirement is expressed as that of minimizing the 
integral 



'i 


By introducing a new state variable such that 


n+1 


1 ’ X n+1 <V = ° 


(76) 


x 


n+1 



(77) 


l 


the problem is reduced to one of minimizing the final value of x n+1 * 

In what follows, it will be assumed that all necessary transformations have been 
accomplished; we will deal exclusively with the set of equations (63) through (65). 

We now define the components of a vector, X, as follows!. 


X. 

] 


n Sf, 
k=l 3 


(78) 


j = 1, 2, 


n 


We also define the function 


H 



T 

= X f 


(79) 


One sometimes refers to X as the costate vector ; the function, H, is known as the 
hamil toman . 

The maximum principle of L. S. Pontryagin(^) is then stated as follows: 

"In the system described by Eq. (63), if there exists a control vector, u, which 
satisfies the constraints, (64), while minimizing (maximizing) the criterion function, 
(65), then this control vector necessarily maximizes (minimizes) the hamiltonian (79)." 

tOur notation anticipates the interpretation of this quantity as the Lagrange multiplier. 
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Making use of the definition of the hamiltonian, Eq. (79), we may write Eqs. (63) 
and (78) as 


x. = 


a h 


X = - 


a h 
a x. 


3 = 1, 2, ••••, n 


3 = 1. 2, ••••, n 


(80) 


(81) 


These are identical in form to the Hamilton canonical equations in analytical 
mechanics (which accounts for the name accorded the function H). 


In application, the extremal control vector, u, is obtained as a function of X after 
the minimizing (maximizing) operation on H is performed. A completely determinate 
solution is then obtained by integrating the 2n equations (80) and (81). This, in turn, 
requires that 2n boundary conditions be specified. However, there are only n boundary 
conditions immediately apparent, namely. 


x j <v 



i = 1, 2, ••••, n 


(82) 


The other boundary conditions depend on the constraints imposed on the final value 
of the state variables and the form of the criterion function, (65). We distinguish 
several cases. 


I. Final Time Fixed; No Constraints on Final State Variables 


In this case , the remaining n boundary values are 

X. (t f ) = - c. (83) 

j = 1, 2, , n 

where the c ^ are the components of the vector c in Eq. (65) 

II. Final Time Fixed; Constraints Imposed on Final State Variables 

One type of constraint imposed on the final state vector is that it lie within a 
prescribed region of the state space; viz. , 

4 > [x (tj)] < 0 (84) 
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In this case, if the function ip is differentiable, we write 


b. 

J 



t = t r 


(85) 


and the boundary values for A are given by 



*, <y = - V 1 b. [x<v] 


(86) 

j= 1, 2, •••• 

, n 


o 

11 

IF 

-a- 


(87) 


where rj is a constant. There are (n+1) equations in the set (86), (87) for the (n+1) 
unknowns A j (tf) , j = 1, 2, , n, and the constant T). Thus one of the above equa- 

tions may be used to eliminate rj, which in combination with the initial values (82) 
yields the required 2n boundary conditions for the set (80), (81). 

In the special case where q (< n) components of x ^ (tf) are prescribed exactly, 

W ' *j f (S8) 

j = 1, 2, , q (<n) 


the remaining boundary values are given by 

VV = - c k 

k = (q+1) , 


(89) 


Etl. Final Time Open 


Since there is an additional degree of freedom in the system, an additional equation 
is required to make the solution determinate. If tf does not appear in the criterion 
function (65), then 


H (t f ) 



0 


(90) 


is the additional relation required. If tf does appear in J, then the variable x Q+1 is 
introduced such that 
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1 


n+1 


Vi ( V = 0 


The final value of x n+1 replaces in the criterion function, and the added relation 
takes the form 


n+1 | 

H (t f ) = £ A f = ° (91) 

j=l 3 3 't=t f 

The statement of the maximum principle, together with the boundary conditions as 
specified above, contains all the information necessary to solve the optimal control 
problem. There is, however, one possible ambiguity and this is discussed next. 

Singular Solutions 


For a certain class of problems, the control, u, is a scalar which enters the 
hamiltonian in a linear manner; viz.+ 

H = I (x. A) + u K(x, X) (92) 


where I and K are scalar functions that are independent of u. Here it has been as- 
sumed that the explicit dependence of the system equations or criterion function on 
time, t, has been removed by defining an additional state variable x Q+ ^ such that 


n+1 


= 1 


n+1 


( 0 ) 


0 


It is possible that the switching function, K (x, X) , is identically zero over some 
finite time interval. In this case, the hamiltonian H ceases to be an explicit function 
of the control variable u, and the maximum principle is unable to provide information 
about the desired optimal control. The usual procedure of selecting u so as to max- 
imize H breaks down. In some instances it is possible to show that the condition 
K(x,X) = 0 violates the constraints of the system (see the latter part of Example 2). 
Thus no further analysis is necessary; i. e. , the solution is not singular. 


t Actually, the discussion that follows is applicable to any function of u as long as 
the form of Eq. (92) remains. 
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In what follows, we will investigate the nature of singular solutions based on the 
work of Johnson and Gibson ( 176 ). It may be shown thatt 

H* = I + u* K = Max (I + u K) = 0 (93) 

u 


0 < t < t^ 


Therefore the condition 


I (X,x) = K (X, x) = 0 


(94) 


corresponds to the case of singular control. The problem is simplified materially if 
the above condition reduces to a relation which is independent of X. To do this, we 
may use the equations 


I = I = I = =0 

K = K = K = = 0 


(95) 


which follow directly from (94). If we now substitute* 


x. 

1 




(96) 


(97) 


into (95) , then it may happen that the optimal control in the singular region reduces to 
a function involving the state variables only; viz., 

u* = u* (x) 
s s 


A result of this type is obtained in Example 2. This is not always possible, 
especially for high-order systems. However, when a reduction of this type can be 
achieved , the optimal control in the singular region is available as a function of the 
current state of the system. 

* Here , and in the remainder of the monograph, the asterisk superscript will be used 
to denote the optimal value . 

tlhe subscript s on u* is used to denote the optimal control in the singular region. 
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For a more detailed discussion of the problem, the reader is referred to Johnson 
and Gibson. ( 1 ^ 6 ) 


The application of the maximum principle to problems of optimal control may be 
summarized as follows. 

a. Form the hamiltonian, Eq. (79). 

b. Derive the optimal control, u*(t), by choosing that u(t) which (subject to the 
control constraints) maximizes the hamiltonian. 

c. Investigate possible singular solutions. 

d. Determine the optimal trajectory by integrating the 2n equations (80) and (81) 
using the optimal control u*(t) and the appropriate boundary conditions. 

i 


It is instructive to compare the necessary conditions for an optimum, 
with the Euler-Lagrange equations (14). If we write 


Eq. (78), 


<p. = x. - f, = 0 
J J j 


then 




whereupon Eqs. (14) become 


af. 




(98) 


(99) 


( 100 ) 


Comparing with (78), we see that the choice of notation is indeed justified. 

It should be emphasized that the maximum principle provides only a set of necessary 
conditions, much the same way as the Euler Lagrange equations and the Legendre Clebsc 
condition provide only necessary conditions in the variational calculus. Sufficient con- 
ditions are hard to come by (except in the case of linear systems). Nevertheless, this 
is usually a mathematical luxury. In practical situations, physical considerations will 
generally confirm uniqueness and sufficiency. 
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We should note also that the maximum principle yields a solution in the form of 
a set of (in general) nonlinear equations with two-point boundary conditions. Tlius there 
exist the same computational difficulties as with the variational calculus. 

In the final analysis, it must be granted that the maximum principle solves no 
problem that cannot also be solved by variational methods. Its appeal is mainly in 
the elegance of its format and the ease with which it can be applied. This, however, is 
a powerful argument in its favor for purposes of practical application. 

The results obtained thus far yield an optimal control that is a function only of 
time and the initial conditions. In other words, the control is open loop . For most 
control systems it is desirable to have a control which is a function of the current 
state of the system; i.e. , closed loop control. The latter is generally hard to come 
by except in special cases. For linear systems with a quadratic performance function, 
it is possible to obtain an optimal closed-loop control by deriving the Hamilton- 
Jacobi equation of the system. This is done in Sec. 3.2. 1. 

Example 2, The Sounding Rocket : The problem formulated in Example 1 will here be 
solved via the maximum principle. We have 


*2 ' f l 

(101) 

2 

(102) 

3 


-^ f 3 

(103) 


The boundary conditions are 


(t.) = 0 

t. = o 
i 


X 2 <V = ° 

x 2 <V “ 0 

(104) 

x 3 <V = 0 

X 3 <V = m p 


where x^ = h, x g 

= V, x 3 = m 



We seek to maximize the final altitude. Consequently, the criterion function to 
be minimized is 

J = - h f s - x (105) 
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Eq. (89) then supplies the additional boundary condition 

X l( t f ) = 1 (106) 

We might just as easily have sought to maximize hj, in which case we would have 
*1 (tj) = -i. The form above has been adopted in order to permit ready comparison 
with the results of Example 1. 

From Eqs. (78) or (81), we find 


X = — 


1 dD 
x 3 3 x x *2 


X = -X +— X 

2 1 X 3 9 X 2 2 


x = x 

A 3 2 2 

X 3 


The hamiltonian is 


H = X 


-s]-^ 


„ / D \ 

1 

= 

X 1 X 2 " X 2 \T~ + g ) 

+ 0 — 


1 2 2 \ x 3 / 

J L X 3 

This function is maximized if 



B - 8., when K > 0 
M 

8 = 0 when K < 0 

where 



(107) 


(108) 


(109) 


( 110 ) 


( 111 ) 


(112) 
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When K = 0, there is apparently no way to determine the optimal control since 
H is then independent of £ This difficulty may be resolved by employing the methods 
discussed in the section on "Singular Solutions." Using (95), the conditions corres- 
ponding toK = K = I = 0 lead to the three equations, 

X 2 M 

— " X 3 = 0 (113) 



(114) 


(115) 


(116) 


Since x 3 = — /3, this relation yields the value for the optimal control in the singular 
region. In other words, a variable thrust subarc (if it exists) must satisfy (116). 

This result is identical to that obtained in Example 1 by the variational calculus; i.e. , 
Eq. (54). 


The relation (90) for determining tf is superfluous in the present case, since 
physical reasoning indicates that the final subarc is a coasting phase (/3 = 0), and 
the terminal condition is reached when x 2 (t) = 0. However, Eq. (90) is satisfied for 
this value of tf and provides a check on the solution. 

We should now like to investigate the possibility of a variable thrust subarc in 
the absence of aerodynamics; i.e. , D = 0. hi this case, the condition (116) leads to 
mg = 0, an absurdity. Therefore, the optimal control in a vacuum is of the bang- 
bang type . 


Example 3, Minimum Time Control of a Nonlinear Process : Consider the second- 
order nonlinear system described by 


*1 ' *2 * f l 


£ = - u ( u x„ +a u \ = f„ 

2 2 \ 2 2 1 / 2 


(117) 

(118) 


32 



Here or is a positive constant, and the centred functions, Uj and U£, satisfy the 
constraints 


rH 

IV 

vH 

JL 

(119) 

0 < y < u 2 < l 

(120) 

y = prescribed constant 



It is required to determine the form of Uj (t) and u 2 (t) such that the system is 
brought to the equilibrium position, = 0, in minimum time from an arbitrary 

initial position 

x x (tj) = a 

X 2 <V = b 

In other words, the function to be minimized is of the form (75). Consequently, we 
define 



x 3 (t t ) = 0 


The problem now reduces to that of minimizing the quantity 

J - x 3 (t f ) 


The components of the costate vector, X, are given by Eq. (78); viz- , 

* 1-0 

*2 = - V U 2 2X 2 


X 3 " ° 


( 121 ) 

( 122 ) 

(123) 
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The hamiltonian is therefore expressed as 

H " X l X 2’ X 2 U 2( U 2 X 2 + “°l) + >1 3 
However, by Eq. (89), 

X 3 (t f) " - 1 

This relation, in conjunction with (123) shows that 

x 3 (t) - -1 

which means that (124) reduces to 


H = X l X 2- X 2 U 2(' , 2 X 2 +aU l)- 1 
It is immediately evident that H is maximized byt 
u 2 = ~ sgn X 2 

If we write Eq. (125) in the form 


H = -l + X lX 2 + X 2 


r/a U^2 

2 x„ , 


or u.\2 


" X 2 U 2 + 2x 


;) 


then we find that H is maximized if u 2 satisfies the following: 

1. For X 2 < 0 (u 1 = 1) 

a. Ifx 2 >0 
then u 2 = 1 


b. If x 2 <0, then 
u 2 = 1 if |p| > 1 

= p ify < Ip | < i 
= y if IpI < y 


(124) 


(125) 


(126) 


(127) 


tsgn y = 1 if y > 0 
= -1 if y < 0 
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2. For X g > 0 (u x = -1) 
a. Ifx 2 <0 


where 


then u 2 = 1 


b. tfx 2 >0,then 
u 2 = 1 if P > 1 


= p ify<p<l 
= y if p<y 


(128) 


p = 


a 

2 x„ 


Conditions (126) and (128), together with the differential equations (117), (118), 
(121), and (122) and boundary values 


Xj (t.) = a 

x i <V 

x „ (t.) = b 
2 ' i 

x 2 <V 

with the added constraint 



= 0 


ri X 2 _X 2 U 2( U 2 X 2 +0£U l)l w = 

t=t„ 


obtained from (91), are sufficient to determine the optimal u^ and Ug. 

For systems of moderately high order, the corresponding computational tasks 
are not at all trivial, even with computer assistance. In the present case, however, 
the phase-plane representation may be utilized to advantage in order to exhibit the 
salient features of the solution. 

We observe parenthetically that the fundamental difficulty in starting a solution is 
that the sign of X 2 is not known initially. However, the form of X^ and X 2 is known, 
since Eqs. (121) and (122) are readily integrated; viz. , 
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where 


A - / Uj 2 dt 

and ^ and /? 2 are constants of integration. The significant information available from 
an inspection of Eq. (129) is that X 2 cannot change sign more than once. 

Now for purposes of phase-plane representation, we express Eqs. (117) and 
(118) as 


dx i = 


x„ d x„ 
2 2 


“ 2 ( U 2 X 2 + “ U l) 


From condition (126), we know that u| = Tl, while (128) shows that 


(130) 


u* = 1 whenever |x„ | s —• 

regardless of the sign of X 2 . Consequently, in the phase plane, only two trajectories 
can enter the origin. One trajectory, (see Fig. 4), is obtained from the solution 
of Eq. (130) with u^ = u| = 1. The other trajectory, , is given by the solution of 
(130) with uj = -1 and u| = 1. We solve Eq. (130) by working backwards; that is, we 
take xj(0) = x 2 (0) = 0. 

The resulting equations are 


For L 


For L. 


V 0 ' X 2 >0 

X 1 " - x 2 - aI ” 


a 

x„ +a 
2 


X 2 > 0 ,x 2 <0 


x = - x + aln 

1 Lt 


a 

a - x„ 


We now observe, for instance, that a trajectory cannot leave unless X 2 changes 
sign. Assume this happens at point (T). Then u^ switches sign and becomes -1, 
while u 2 takes the value y, the trajectory continuing as shown by the path (T) - (J) - (5). 
At point (2), on line L 2 , u 2 switches to 
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Figure 4. Optimum Trajectories in Phase Plane 
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▲ 



Figure 5. Optimal Control Regions in Phase Plane 
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u 




satisfying this relation until the trajectory reaches point (3), after which u| = 1. 

Continuing the analysis along similar lines , we see that the phase plane is divided 
into six semi-infinite regions as shown in Fig. 5. In the half plane to the left of Lj - 
, the control function, u* = -1, while in the other half plane, u^ = 1. Since, as 
shown earlier, Ag can change signs only once, a trajectory, once it hits , will 

remain on it until reaching the origin. Several typical trajectories from aribtrary 
initial points are sketched in Fig. 5. 

A complete solution for the case where 

X 1 (t i> = ~ 5 * 45 * Q = 1 - 2 
x 2 (t.) = 0 , y = 0.5 

is shown in Figs. 6 and 7. 

3.1.3 Gradient Method 

The earliest studies in optimal flight control systems employed the classical 
variational calculus, since this was the only tool available for the problem at hand. 
While many important and significant results were obtained, it became apparent that 
computational difficulties associated with the solution of nonlinear two-point boundary 
value problems (generated by the variational approach) severely limited the usefulness 
of the method. Since many problems of interest were of sufficient complexity to make 
the computational difficulties overwhelming, it was natural to seek new ways of solving 
them. In this respect, one of the more important concepts that emerged was the so- 
called "gradient method." The main virtue of this approach is that the numerical 
solution reduces to that of solving an initial value problem rather than a two-point 
boundary type. This enhances the computational aspects no end. 

The main ideas of the method evolved in stages rather than appearing overnight 
in fully developed form. It appears that the fundamental concept is due to C our ant. ^**1) 
A systematic development specifically oriented for aerospace applications is contained 
in the works of Kelley^ 3, and Bryson^®’ Various refinements and related 
techniques were developed by Ho( 135 ), Knapp( 139 ), DreyfusU 52 ), and Jazwinski.^ 133 ) 

As in Sections 3. 1. 1 and 3.1.2, we will present only the main results reflecting 
the current state of the art. Proceeding from first principles in each area requires 
that a fairly elaborate groundwork be laid, which entails a task beyond the scope of 
this monograph. It will be shown later (Sec. 3.1.5) that the aforementioned results 
are quite readily derived from a more general point of view within the framework of 
the theory of dynamic programming. ( 16 ) The chief merit of this approach lies in its 
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conceptual plausibility and intuitive appeal. The Euler Lagrange equations and the 
maximum principle are found to be necessary consequences of the principle of 
optimality. However, in order to discuss the gradient method in this context, it 
is convenient to provide a motivational framework prior to presenting the main results. 

The basic idea of the gradient method may be made plausible by considering an 
analogous problem in ordinary calculus. 

Steepest Descent in Ordinary Calculus 

Consider a scalar function of n variables 

f = f (x) (131) 

where, as before, x is an n vector. It is required to determine the vector x(°) such 
that the function f is stationary; i.e. , is either a maximum or a minimum. 

A first-order change in f is given by 


df = £4L d x. = 


3 x. j 
3=1 3 


(V x *) T dx 


(132) 


where 


V f = 
x 


af 




af 

ax 


(133) 


which is the usual definition for the gradient of a function. It is known that the gradient 
is the directional derivative in the direction of greatest change. Therefore, for a 
prescribed increment, dx, the greatest change, df, is produced if dx is taken in the 
gradient direction , viz . , 


dx = K V f 
x 


(134) 


where K is a constant that is chosen negative if a decrease in the value of f is desired 
and positive if an increase is desired. 
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With this choice of dx, the change in f becomest 


df = K(v x f) 


(135) 


Note that since Eq. (132) is a linear approximation, dx must be "sufficiently small" 
to ensure that the linearity restrictions are not violated. 

Suppose now that x^ 1 ) is a first guess for the minimizing vector. The function, 
(131), takes the value f (xW). We now seek to apply a fixed increment, dx, to x(l) 
such that the corresponding decrease in f (xC 1 )) is a maximum. As noted above, this 
is achieved by taking dx in the negative gradient direction; i.e. , taking as an im- 
proved estimate 


x< 2) = x (1 > - K V f 
x 


(136) 


K = positive constant 
The value of K must be chosen such that 




(137) 


If this relation is not satisfied, it implies that linearization was violated, and a 
smaller value of K must be used. The process is continued until 


|f(x (rtl) )-f(x <r >) 


<€ 


(137) 


small preassigned positive constant 


The vector x( r ) is then the one that minimizes f (x) to the desired degree of ac- 
curacy. If x<°> is the exact minimizing vector, then 


f x<°> = 0 


(138) 


It should be emphasized that this approach (like the variational one) ensures only 
a local extremum. In practice, global optimality is generally verified by physical 
considerations . 


tFor any vector, a, a 2 s a^ a. 
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The nature of the gradient method suggests the picturesque analogy of descending 
a hill in some efficient manner. At any given point, one determines the steepest slope 
and then proceeds a short distance in that direction. 

This procedure, in the context of ordinary calculus, deals with incrementing 
variables in order to optimize a function. In the variational calculus, one seeks to 
optimize the function of a function , and here we must vary a function rather than a 
variable. As a result, we are confronted with all sorts of disagreeable complications. 
Nevertheless, the basic principle still applies and yields a powerful tool for optimi- 
zation problems in aerospace guidance and controls. It will be shown that the gradient 
format for variational problems is conceptually identical with that for the elementary 
calculus, except that Green's functions replace the partial derivatives and the vanishing 
of the gradient is intimately related to the conditions expressed by the Euler Lagrange 
equations. These ideas will be developed next. 

Steepest Descent in Function Space 


The essential features of the gradient method, as applied to variational problems, 
may be exhibited by investigating the following. Given the integral 


I = f L jx (t), x (t), tjdt 


( 139 ) 


it is required to choose the vector, x (t), such that I is minimized (or maximized). 

Note that there is an essential distinction between this and the previous problem. 
Whereas x was an n-dimensional variable before, x (t) is an n-dimensional function now. 


Suppose now that tj and tf are fixed. By taking the variationt of I, we find 
t. 


6 I 


= I 6 L (x, x, 


t) dt 




dt 


■l 


(V X L)^ 6 x + (v^ L)^ 6 xjdt 


( 140 ) 


tSee Appendix B. 
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Since the variation and differential operators are commutative, t 


^< S *> -»(af) 

Eq. (140) becomes 


( 141 ) 


6 I = 


"I 


T d 

<v-> + 


|(VJ U T 8 *1 


(8 *) T i (V. L, 


dt 


However , 


(142) 


r 1 

It [ (V i( L) T 6xJ d t . (V. L) T 6x 


= 0 


(143) 


since 6 x (tf) = 6 x (t^) = 0 because t^ and tf are both specified. Therefore, Eq. (142) 
reduces to 


-/ 


6 1= / (6x) 

‘i 


V x L ' dt < V x L) 


dt 


(144) 


Defining a generalized gradient by 


n .-i 

L J* dt 


J x dt V x v x 


(145) 


we may write 


■'f 

® I = f (6x) T [l] x dt 


(146) 


tSee Appendix B. 
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The variation of I is thus expressed as an inner product in function spacet, com- 
pared with the differential of f, Eq. (132), which is expressed as a conventional inner 
product of finite vectors. In addition, we note the correspondence 


df 

=>6 I 

dx <= 

=> dx (t) 
— d 

V f 1 
v x 

Is? \ 


The operator [ ] x defined by Eq. (145) may thus be identified as the gradient of 
I in function space . Consequently, in analogy with (134), we choose 

fix = k[l] x (147) 

as the variation in x (t) that produces the greatest change in fil. Here, K is a constant 
that is negative if it is desired to decrease I, and positive otherwise. The gradient 
is evaluated along the nominal path , x(l)(t) . The variation in I is therefore expressed 
as 


5 I 



(148) 


in complete analogy with Eq. (135). 

We recall that a necessary condition for f to be an extremum is that the relation 

V x f(x) = 0 (149) 

be satisfied; i.e. , that the gradient vanish at the local extremum. 

Invoking a fundamental theorem of the calculus of variations , ( 98 ) — namely that 6 I 
must vanish if I is stationary — an inspection of Eq. (146) leads to the condition 

[l] x = 0 (150) 

since 6 x is arbitrary between the end points, t. and t^. In other words, a necessary 
condition for the extremum in function space is that the generalized gradient vanish. 


tSee Appendix C. 
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In order to relate this condition to the necessary conditions derived via the con- 
ventional variational approach , we will solve the problem using the methods of Sec . 
3.1.1. To this end we define 


x 


n+1 



L (x, x, t) dt 


and pose the equivalent problem of minimizing 


J 


n+1 


<t f ) 


(151) 


(152) 


subject to the constraint 

<P = x n+1 - L (x, x. t) = 0 


( 153 ) 


and initial condition 


x „ (t.) = 0 
n+1 ' V 


Forming the augmented function! 

F " ‘(vr 1 ) 

we find 


a f 

a x. 

3 


= -X 


a l 
a x. 

3 


a f 

a x. 

3 


= -X 


a l 

a x. 

3 


j = 1, 2, , n 


while 


a f 


Tl 


= x 


n+1 


aF 


a x. 


= o 


n+1 


tSee Eq. (13) et seq. Note thatX as used here is a scalar. 


(154) 
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The Euler Lagrange equations are then 



The last relation shows that X is a constant, which means that the preceding 
equations become 



j = 1, 2, , n 


(155) 


These are the Euler Lagrange equations for the problem at hand. Written as 




(156) 


we find by comparison with (150) that the vanishing of the gradient in function space is 
another way of expressing the Euler Lagrange equations. 

For purposes of numerical computation, the evaluation of the gradient employing 
the operator defined by Eq. (145) involves the calculation of higher-order derivatives. 
This is never an advisable numerical procedure, and other computational devices must 
be sought. It turns out that the use of Green’s functions leads to an efficient algorithm 
for the numerical computation of the gradient. We turn to a discussion of the basic 
ideas involved in this approach. 

Consider the system described by 

x. = f (x, u, t) (157) 

(j = 1, 2, n) 

where suitable boundary values are prescribed, and u is an m-dimensional control 
vector. It is required to determine u(t) such that the function 

J = J [x (t f )J (158) 

is an extremal. As noted in the previous sections, a wide variety of performance 
functions can be expressed in this format. Suppose now that one can select a control 
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function, u(t), that satisfies the given boundary conditions for (157) but that does not 
necessarily optimize (158). We will refer to this control vector and the corresponding 
trajectory as the nominal u(t) and x(t) respectively. From this nominal trajectory, we 
may calculate a value for J. 

Suppose now that we adopt a new control, u(t) + 6u(t), which represents a per- 
turbed value of the nominal control. This will produce a perturbed trajectory, x(t) + 

8 x(t). We seek to determine the form of the perturbation that will produce the greatest 
improvement (increase or decrease, as the case may be) in the performance criterion, 
J . For 6 u and fix sufficiently small , we have , to first order , 

Xj +fi x^ = f (x, u, t) +8 f^ (x, u, t) (159) 

(j = 1, 2, • • • , n) 


By way of the Taylor expansion. 


n a f. 


« fj <x, u, t> ■ J2 s ^ + £ 


m 5 f , 


k=l 


iFl 


d u. 


6 u . 


and noting thatt 


8 = dt < 8 Xj) 


we have, from (159), 


1 af, m a f. 


(j = 1. 2, 


This may be written in vector matrix form as 
d 


dt 


(fix) = A (t) 6 x + B (t) fi u 


6 x (t.) = 0 


.n) 


(160) 


(161) 

(162) 


tSee Appendix B. 
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where 


A(t) 


B(t) 


a 

9£ 2 

axi 

a *n 

aX 2 

3f 2 

\* 2 

a f 

n 

9 X r, 

• n 

a'f 

n 

a x, 

a x„ 

a x 

i 

2 

n 

9f i 

* f l 

* f l 

a u, 

a u„ 

a u 

i 

2 

• m 

• 

9f 2 

9f 2 

• 

• 

• 

• 

• 

^1 

• 

• 

• 

• 

• 

9 . U 2 

• 

• 

• 

• 

• 

• 

• 

m 

• 

• 

» 

• 

• 


a’f 

a’f 

n 

n 

n 

a u. 

a 

a u 

1 

2 

m 


All partial derivatives are evaluated along the nominal trajectory. 


Eq. (161) is a linear nonstationary system whose solution is 


given by (B>5) 



t) B(t) 6 u (t) dt 


where <J>(t,t k ) is the state transition matrix for the system (161). It satisfies the 
relations 

6 x (t) = yb; 6 x (t k ) = b 


(163) 


(164) 


(165) 


(166) 
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4>(t k » t^) = I, the unit matrix 


(168) 


Now 


6 J = j[x (t f ) + 6 x (t f )] - j[x (t f )] 


which reduces to 


fi J = c T fi x (t f ) 


where 


(169) 



Substituting (165) in (169), we have 
tf 

6 J = f (6u) T B T (t) <X> T (t f , t) c dt 


(170) 


(171) 


The evaluation of this expression is awkward, because the behavior of 6J depends 
on <J>(tj,t) as a function of t, its second argument, while cp(tf,t) is normally related to 
current and past state as a function of its first argument; see Eq. (166). This pre- 
dicament may be neatly resolved by employing the theory of adjoint equations. Speci- 
fically, we writet 

X = - A(t) X (172) 


tThe notation anticipates the interpretation of X as the vector Lagrange multiplier. 



which is the vector matrix equation, adjoint to the system (161). If we take the initial 
conditions on this to be 

X (t f ) = c (173) 

where c is given by Eq. (170), then the solution of (172) is given by 

X (t) = tf(t, t f ) c (174) 


where ^(t.y is the state transition matrix for the system (172) and satisfies 


at *(t, y = - A(t)¥(t, t k ) 

(175) 

■$r (t^, y = 1, the unit matrix 

(176) 


Notice that Eq. (174) is integrated "backwards" from the given final conditions. 

A(t) is therefore a known function (which, of course, depends implicitly on the reference 
trajectory) . 

Now it is known^ 155 ) from the theory of adjoint equations that 

<f> T (t f , t) = *(t, t f ) (177) 


which means that (174) may be expressed as 


X(t) = 4> (t f , t) c 


Substituting this in Eq. (171), we find 
-t f 

6J = / (6u) T B T (t) A (t) dt 


-{ 


If we define 


T 

H = f A 


then 


8H -v H = B T x 


?(U U 


(178) 


(179) 


(180) 


(181) 
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Therefore, Eq. (179) becomes 


6 J = 



(182) 


This now has the form of an inner product in function space, and, in the light of 
the previous discussions, we interpret (a H/du) as the gradient in function space with 
respect to the control vector, u(t). Since the latter is arbitrary, the condition for the 
vanishing of flj (which thereby ensures the J is an extremal) is that the gradient vanish; 
viz. , 



(183) 


It will be shown later (Sec. 3.1.5) that this condition is precisely the maximum 
principle discussed in the previous section. Thus H, as defined by Eq. (180), is 
indeed the hamiltonian of the system, and Xis the vector Lagrange multiplier. We 
note also that the quantities, a H/ au^, are the Green's functions (sometimes called 
"influence functions") for the system; these indicate how small perturbations, fin. 
affect the performance function, J. 


It now follows that a maximum change in fij is obtained when 


6 u = K v u H 


(184) 


i.e. , the change in 6 u is along its gradient. 

K is taken negative if a decrease in J is sought, and positive otherwise. As a 
result. 
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= K 


i: 


(V u H) 2 dt 


(185) 


in the same manner as before . 

hi order to limit the variation 6 u, it is convenient to introduce the constraint 



dt 



(186) 
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where e is a prescribed constant. Therefore, in order to extremize 6J subject to the 
constraint (186), we introduce an additional Lagrange multiplier, p, and write 


6 J 


’ f f <«") T (|f) dt+ l‘[e 2 - ^ £ (Su) 2 dt] 

l. ^ l. 

1 1 

= X f(6u)T [(^)^ 6u ] dt+ ^ e2 


(187) 


For an extremal 6J, the first variation must vanish; i.e. , 


6 (6 J) 




= 0 


which leads to 

, 1 /9H\ 1 

6u = — ( j = — v H 

2/i \9 u / 2p u 

where the Lagrange multiplier , p , is evaluated from 



(188) 


(189) 


The constant, p , may be identified with constant K in Eq. (184). 

In principle, the procedure is now straightforward. One selects a nominal u(t) 
that satisfies the boundary conditions for the system and that therefore yields a 
nominal trajectory, x(t), along with some value for the performance criterion, J. In 
order to decrease (or increase, as the case may be) this value of J, one uses a per- 
turbed control function, u(t) + fiu(t), where 6 u(t) is selected in accordance with Eq. 
(184) or (188). One then determines a new trajectory and a new value for J that is 
improved (hopefully) over the preceding one. This procedure is continued until the 
improvements in J are less than some predetermined amount, or when 6 J -»0. 

In practice, a variety of complications preclude the formulation of a simple cook- 
book procedure that will always work. Thus far , these factors have been touched upon 
only lightly or not at all. Since a successful application of the gradient method depends 
upon a full understanding of the various subtleties involved and of how to resolve them 
in particular applications, we must consider these in a little greater detail. 
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The Problem of Constraints 

Having provided a degree of motivation for the gradient technique, it remains to 
investigate the manner in which various realistic constraints are accounted for in the 
general approach. Three main types of constraint occur in physical applications. 


1. Terminal 

^a[ x( V’ t f J = 0 < 190 > 

a = 1, 2, ••••, p 

(t f ), t f ] < 0 (191) 

0 = p+1, P+2, , q 


2. State Variable 
g k (x, u, t) < 0 

3. Control 


(192) 


k 


1 , 2 , 


r 


l = 1, 2, , m 


(193) 


The final time, tj, is generally free (unspecified), and the initial conditions, Xj(t.), 
may be specified or left open to optimization. 


The simplest type of constraint, as far as the gradient method is concerned, is the 
inequality constraint, (193), on the control variables. These are accounted for in the 
computational sequence as follows.^®) Assume that the nominal control vector u nom (t) 
satisfies (193) for all t.t The gradient technique then generates a new control 


new 

u 


(t) 


old 

= u 


(t) +6u (t) 


and it is observed whether at each instant of time, the relation 


(/= 1, 2, •••• , m) 

tSince the numerical procedure must of necessity deal with discrete instants of time, 
U nom (t) satisfies (193) at all discrete time instants involved in the computation. 
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is satisfied. If it is, we then use u new (t) to generate a new trajectory in the manner 
previously discussed. If not, we "trim" the control as follows. 


new new „ . 

llu / < 


and 


new 


new 


U X (t k> = V l if ^ 


To take account of the state variable constraint, (192), we define new state 
variables 


-l: 


X n+k (t) = / (x * u * dt 


with 


G (x, u, t) = 0 if g < 0 


■ g k i£g k >0 


and add 


k= 1, 2, ••••, r 


k = 1, 2, •••*, r 


(194) 


(195) 


X n + k = G k < X > U ’ *> 

(k = 1, 2, •••♦, r) 

to the other state equations, together with the initial conditions 

Vk (t i) = 0 

k = 1, 2, • • • • , r 


(196) 


(197) 


We note that the derivatives of G k are everywhere continuous if the derivatives of 
g k are continuous. ( 153 ) 
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As a result of this operation, the state variable inequality constraint, (192), is 
equivalent to 

x n +k <V = 0 < 198 > 

k = 1, 2, •••*, r 

which is in the form of the terminal equality constraint, (190). 

Consequently, we will dismiss state variable inequality constraints from further 
consideration , assuming that they have been transformed to the equivalent constraint, 
(190). 

The problems that now remain are the treatment of the terminal constraints, (190) 
and (191), the selection of appropriate step sizes and proportionality constants, and 
the finding of a nominal control vector that yields a trajectory satisfying prescribed 
terminal and initial values. These will be discussed within the context of the general 
solution, which is considered next. 

The General Solution 


Given the system 

x = f (x, u, t) (199) 

x s n-dimensional state vector 
u = m-dimensional control vector 
with the initial conditions 

x. (ty = aj (200) 

j = 1, 2, ••••, s (<n) 

where the (n-s) initial conditions, which are not specified, are left open to optimization. 
At the final time , tj , the constraints 

*a [ x { V* *f] = 0 < 201 > 

« = 1, 2, ••**, p 

^[ X ( V’ *f J <° ( 202 > 

P = P+1. P+2, • • • • , q 

must be satisfied. In general, t^ is unspecified. 
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The components of the control vector must satisfy the inequality constraints 

V l < < u i (203) 

i = 1, 2, • •••, m 

It is assumed that constraints on the state variable, as given by (192), have been 
transformed to type (190) by the introduction of additional state variables in the manner 
indicated previously. 

We now seek to determine the control vector, u(t), that minimizes (maximizes) 
the performance index 

J = jfx (t f ), t f ] (204) 


such that the constraints, (201) through (203), are satisfied, given the initial conditions, 

( 200 ). 

The problem thus formulated is of wide generality. Many realistic optimization 
problems in aerospace guidance and control can be expressed in this format. 


In presenting the solution to the problem, we will limit the discussion to the main 
results. The complete derivations are quite lengthy, requiring many digressions that 
are beyond the scope of the present monograph. For a detailed treatment, the reader 

. .. .. .. , (3,42,55,75,76,103,135,153) 

should consult the appropriate references. 


Suppose now that we arbitrarily select a nominal u(t) that satisfies constraint (203) 
and that thereby yields a nominal trajectory, x(t), satisfying the initial conditions (200) 
but not necessarily all the terminal constraints, (201) and (202). 


The problem is now taken as one of minimizing the function! 

q 


P - K 0 J + ^ ‘yry 
7=1 


£ V, 


(205) 


where 


= -1 if J is to be maximized 
= +1 if J is to be minimized 


tThe values of J and tf) 

7 


are, of course, obtained from the nominal trajectory. 
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K = 0 if the nominal trajectory satisfies the terminal constraint, # y , 
given by (201) or (202). 

The basic idea here is that if the K y are large (compared with K 0 ), any control, 
u(t), that seriously violates terminal conditions (201) and (202) will give rise to large 
values of K y 0 y 2 , with the result that the next iteration on u(t) will be biased in favor 
of satisfying the terminal constraints rather than optimizing J . Consequently, the 
more closely the terminal constraints are satisfied, the smaller the value of K y «/) y 2 
(whatever the size of K y ) , with the result that subsequent iterations on u(t) become 
biased in favor of optimizing J. This notion, which is intuitively plausible, is also 
mathematically legitimate^ 156 > 157 ) 

Now since the final time, t f> is free, one of the ij) y =0 may be singled out as a 
stopping condition — and is therefore not included in the summation term of Eq. (205) . 
Call this stopping condition 

(t f ), tjJ = 0 (206) 

At each pass, the integration of the trajectory equations is stopped when £1 = 0. 

Defining the hamiltonian as before , 

H = f T X (207) 


where A satisfies 

A = -A(t) A 

with A(t) given by (163), we take, for the initial conditions on A, 



Then, if the variation in the control vector is constrained by 

f h £ w x <t)[Su x (t)J 2 dt = e 2 


(208) 


(209) 


(210) 


€ = prescribed constant 
w^(t) > 0, a weighting factor 
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we take , as a new control , 


new old/xxj.* 

u (t) = u (t) +0u (t) 

where u°^(t) represents the original nominal control and 

6 u (t) = -1— — 

V' 2 fi w^(t) 9u x 

l — 1, 2, • • • • , m 

where p is an additional Lagrange multiplier calculated from 



(211) 


( 212 ) 


(213) 


The weighting functions, w.(t), are introduced to permit a greater flexibility in 
the numerical convergence procedure, especially in regions of high sensitivity. For 
initial trials, one may take w^(t) = 1 for all l . 

Now by defining 


V 


2 



(214) 


we may write Eq. (213) as 


2 

V 



2 

c 


(215) 


Using this to eliminate n from (212), the latter becomes 



€ a h 
r?w x a Ujg 


(216) 


It can be shown that the change in P is then given by(^®®) 

dP = et7 (217) 

which means that € must be chosen negative, since a minimum of P is sought. 
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The (n-s) Initial conditions, which are unspecified, are now modified according 


d VV --^ <218) 

j = 8+1, s+2, • * • • , n 

It is noted that the Xj (t^) are the Green's functions for the initial conditions. 

The general computation procedure is now summarized as follows. 

1. Select a nominal control vector, u(t), and integrate the equations of motion, 
(199), using the initial conditions, (200). If only s of the n initial conditions 
are given, choose arbitrary values for the (n-s) unspecified initial conditions. 
These will then be optimized in the course of the computational procedure. 

The integration is terminated when condition (206) is satisfied. This yields 
the nominal trajectory and the nominal terminal time, tj. In general, not all 
the terminal constraints, (201) and (202), will be satisfied. 

2. Calculate P from Eq. (205). 

3. Integrate the adjoint equations, (208), "backwards," using the terminal con- 
ditions given by Eq. (209). Note that the matrix, A(t), is defined by Eq. (163). 

4. Calculate the hamiltonian, Eq. (207), and the Green's functions, dH/du^. 

5. Choosing some appropriate values for the weighting functions, w^(t), evaluate 
7) 2 by Eq. (214). 

6. Using an appropriate step size, c (which must be a negative number), cal- 
culate 6 Ujj , with l = 1, 2 , • • • • , m. The resulting u new = u°^ + 6u must be 
"trimmed" to satisfy the inequality constraints, (203). 

7. Using u new (t), obtain a new trajectory in the manner of step 1. The unspecified 
initial conditions are modified in accordance with Eq. (218). 

8. Calculate P from Eq. (205), and check whether this is less than the value of 

P from the previous iteration. If not, reduce the size of c and return to step 6. 

9. Stop the iteration when some suitable accuracy criterion is satisfied. This 
may take the form 

2 

T) < a 
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U | < p 

lr cr < r a 


v° 


a = 1, 2, , p 


/3 = 1, 2, •••*, q 


where crandp^ are prescribed constants. 

Remark : Various arbitrary constants must be chosen in the computational procedure 

described above. The values for these constants will generally depend on 
the particular problem under investigation, and no one particular set of 
values is best in all cases. Nevertheless, certain guidelines are generally 
valid. First, as far as the Ky of Eq. (205) are concerned, a large value 
will yield a new trajectory that will tend to satisfy the terminal constraints 
to a greater extent than it optimizes the criterion function, J. In short, the 
early iterations will seek to yield an admissible trajectory that satisfies all 
the terminal constraints before optimizing J . The value of Ky may be 
periodically reduced in the course of the computation. This process in- 
dicates a willingness to accept mild excursions from the terminal con- 
straints in favor of optimizing J, which is the primary goal. 


The proper step size e is difficult to determine initially, since it 
reflects the permissible linearity range. Too low a value means long com- 
puting time, while too large a value violates linearity and yields inac- 
curate results. A few preliminary trials should yield an acceptable range 
of values . 


The weighting functions , w^ (t) , are introduced mainly to afford a 
greater degree of flexibility in the convergence procedure. Normally one 
may take Wjj (t) = 1 for all A and t. It may happen that near the optimum, 
large changes in P result from even small step sizes, e . In this region, 
the w A (t) may be manipulated so that the 6 u^ may be varied with time. 

This sometimes produces a smoother convergence to the optimum. 

Example 4 : We will now investigate the problem of maximizing the payload for a single- 
stage boost vehicle that is required to attain a prescribed altitude, velocity, and flight 
path angle. It is assumed that: the vehicle trajectory is contained in a plane passing 
through the center of the earth; the earth is spherical; and the gravity forces obey the 
inverse square law. In this case, the motion is described by (see Fig. 8) 

mV = t cos a. - D - mg sin y (219) 
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Figure 8. Coordinate System for Example 4 





The symbols have the following meaning. 

C = speed of sound; a known function of altitude 
D = drag; a known function of Mach number and angle of attack 
g = gravity acceleration; a known function of altitude 
= gravity acceleration at earth's surface 
h = altitude 

L s lift; a known function of Mach number and angle of attack 
m e instantaneous mass of vehicle 
M = Mach number = v/c 
R = range measured along earth's surface 
r Q = radius of earth 
T = thrust 
V = velocity 

V = velocity of exhaust gases in rocket engine; a known constant 
© 

a s angle of attack 

ft = thrust variation parameter 

y = flight path angle 

<p = angle of earth's polar axis with perpendicular to plane of motion 
E 

to = angular velocity of earth about polar axis 

hi 

The control variables are /3 and a; i.e. , the thrust magnitude and orientation. t 
These are required to satisfy the inequality constraints 

a <o;<a (226) 

0<£</3 m (227) 

It is also required to limit the axial and normal loads as follows. 

(T + L sin a - D cos a) /m g < L (228) 

A 

t Short-period dynamics are neglected; therefore the thrust angle and angle of attack are 
equivalent. 
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(L cos a + D sin a) / m g < Ljj 


(229) 


where and are prescribed constants. 

The following boundary conditions are specified. 


V (t t ) = 0 


y (t t ) = ir/2 

V (t f) - V T 

R (t t ) = 0 

y <v = y T 

h (t t ) = 0 

h (t f ) = h T 

m (ty = 


ct* 

II 

O 

t f = free 


We may therefore take, as a stopping condition, 

n = h (t f ) - h T = o 


(230) 


(231) 


and the remaining terminal constraints become 


01 s V <t f ) - v T = 0 


0 2 = v (t f ) -y T = o 


(232) 

(233) 


The i nequali ty constraints, (228) and (229), involve state variables. We therefore define 

(234) 


6 


-fS 


dt 


-( 


G 2 dt 


(235) 
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where 


( 236 ) 


(237) 

(238) 

(239) 

(240) 

(241) 

(242) 

(243) 

ity constraints, (228) and (229), are 

(244) 


( 245 ) 



u 2 -fi 


X 2 =y 
x 3 = h 

x. = R 
4 


x_ = m 
0 


then the problem may be expressed in the standard format as follows. 
Given the system 


II 

a* 

to* 

cos u l - D ^x^ • x 3> U 1 

Ul 

f r o y 


X 5 

g 0 ' 

VO +X 3/ 

H 

to* 

sinu i + L(x 1 . x 3 , uj 

_ g o 

/ r » Y 


*7*2 

X 1 

vo + x 3 / 


cos x_ 


x cos x 

r + x„ 
0 3 


-2 ^cos^e . f 2 


x„ = x, sin x„ - f„ 
3 12 3 




cos x„ = f . 
2 4 


*5 ' - U 2 * f 5 


*6 ' G 1 ' f 6 


*7 = °2 " *7 


where 


G = 0 if g x < 0 

2 

= S ! if g i > 0 


( 246 ) 

( 247 ) 

( 248 ) 

( 249 ) 

( 250 ) 

( 251 ) 

( 252 ) 
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G„ = 


0 if g 2 < 0 


= g 2 if g 2 >0 


g l = 


' v e U 2 + L ( x r X 3- U l) Sin U 1 ~ D ( x l> X 3* U l) 


X 5 g 0 r 0 


( X l ,X 3 >U l) C0SU l +D ( X l' X 3 ,U l) SinU l 
X 5 g 0 "o' 




( r 0 tX 3 ) 2 - L N 


with the terminal constraints 

*1 ' W- V T “ 0 
*2 = *2 <V - y T " 0 
*Z * *6 <V * 0 

*4 = X 7 <¥ = 0 

stopping condition 

X 3 ( V- h T = ° 

and initial conditions 

X 1 <V = ° X 5 <V = “i 

X 2 ( V = ff/2 X 6 <V = ° 

X 3 <V = ° X 7 <V = ° 

x 4 <y = 0 

It is required to determine u^t) and u 2 (t) such that 
J = x 3 (t f ) 

is a maximum, subject to the control variable constraints 


(253) 

(254) 

(255) 

(256) 

(257) 


A 


> 


J 


(258) 


(259) 
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(260) 


0 < u 2 < 0 M (261) 

The computational procedure now continues in the manner outlined previously. 

3.1.4 Dynamic Programming 

The theory of dynamic progr ammin g^^) is one of those rarities — a genuinely 
original contribution to a long outstanding problem. As is often the case with the 
appearance of a new idea, a flood of papers dealing with extensions, ramifications, 
and applications has resulted. The theory has helped influence the development of such 
diverse areas as management science, information theory, sequential analysis, optimum 
filtering, adaptive control, learning theory, optimal trajectories, and variational cal- 
culus. The passage of time, however, has had a sobering effect on some of the overly 
optimistic claims made in the early stages of the theory's development. Space limita- 
tions preclude discussion of all facets of the theory save those immediately relevant 
to this monograph. We shall content ourselves with the barest outline of the main 
ideas and discuss a variety of applications that hopefully will clarify the method and 
exhibit its utility. In the next section we will show how the basic premise of dynamic 
programming includes, as a special case, each of the optimization techniques thus far 
considered. 


Of fundamental concern in what follows will be the concept of a multistage decision 
process . At any one stage in this process, one may make a decision (select a control 
function), following which the next stage is reached. Successive stages are related by 
a known transformation . Each stage is characterized by a state vector , and each 
decision results in some cost . These ideas may be made explicit as follows. 

Let the state of a dynamic process be characterized by a state vector, x. Suppose 
that successive states are related by 



where u^) is a decision (control) vector, and T is a known transformation. Eq. (262) 
is a statement of the fact that when in stage (state) j, a control u^) is chosen, and then 
the new stage x^ +1 ) is reached, depending on the explicit form of T fc^), u^)j. As- 
sume there are N stages in the process, and a performance index (cost) is given by 


j - £ »«’) 

j=i ' ' 


(263) 
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We pose the problem of choosing the control sequence, u^, • • • • , u^, such 

that the function J is a min imum (maximum). For ease of discussion, let us call any 
permissible control sequence, u^), u^), ••••, u^\ a policy. If the initial state of 
the process is characterized by the state vector, c, the value of J is obviously a 
function of c and the policy (for a given value of N). Let us now define 

W^(c) = the value of J when the process starts in state c, having N stages 
1 to go, and using an optimal policy . 

An optimal policy, of course, is one that optimizes the cost function, J. The 
fundamental premise of dynamic programming is embodied in the following. 

The Principle of Optimality: An optimal policy has the property 

that whatever the initial state and the initial decision, the remaining 
decisions must constitute an optimal policy with regard to the state 
resulting from the first decision. 


The principle of optimality is used to derive, for the W.(c), recurrence relations 
that yield both the optimal policy and the value of the optimal cost function. The 
reasoning proceeds as follows. We are in the initial state, c, and have N stages to 
go. We pick an initial value of the control vector, u^ 1 ). As a result, we reach a 
new state, 


x (2) = t ( c , u (1) j (264) 

and incur a cost, g (c , u^ 1 ^, with (N-l) stages left in the process. If we proceed in 
an optimal fashion from this new state, the cost incurred is W N-1 |t ^c.u^VJ, by 
definition. By way of the principle of optimality, we see that for some specific value 
of ud) we incur a cost 


But we wish to choose u^ 1 ) so that it is maximized. t Therefore 

g (e, u<1) ) + W N .i [ T (c, u< X >)]j 


W„ T (c) = Max 

N U (D 


(265) 


tFor definiteness, we consider the problem of maximizing J. The argument, of course, 
holds equally well for minimization. 
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Consider now the function, W-^c); i.e., the optimal value of J with only one stage 
to go. It is easy to see that 

W (c) = Maxg(c, u^J (266) 

1 U (D V ' 

For purposes of computation, one proceeds as follows. Assume that u^ 1 ) may take 
any one of a prescribed finite set of values. Then calculate W 1 (c), using (266), for a 
range of values of c. This yields a table of W^c), each with its corresponding optimal 
u(l) which is then utilized to calculate W 2 (c) via (265) for a range of values of e, along 
with its corresponding optimal u(l), etc. Proceeding in this fashion, we find Wj(c), 
W 2 (c), • • • • , Wjj(c)> together with the optimal policy for each. Note that this is a flood- 
ing technique ; in effect, we solve the specific problem by imbedding it in a class of re- 
lated problems. This is a highly efficient technique but has one major limitation. It is 
what Bellman refers to as the "curse of dimensionality." Suppose that the vector, x 
(or c), has n components, and assume that it is required to evaluate 100 different values 
for each component. Then to store Wj(c) in the computer would require 100 n storage 
cells. Since modern-day computers have on the order of 30,000 storage cells, we see 
that n=3 is a marginal figure. Various means of overcoming this limitation are dis- 
cussed in Refs. 9, 10, 168, 169, and 170. We will digress for a moment to consider 
an elementary application. 

Example 5 : Given the system 

x = x 2 + u (267) 

where x and u are both scalars , determine u(t) such that the function 
t 

/ f 3 

|x -u | dt (268) 

is a minimum; the final time, tj, is given. The control function u(t) must satisfy the 
inequality constraint 

-2 < u < 2 (269) 

This simple, rather contrived problem illustrates the computational procedure 
using dynamic programming and also exhibits both the power and limitations of the 
method. The problem is, first of all, nonlinear, having a cost function that is not 
everywhere continuous, and must also satisfy inequality constraints on the control 
variable. These features introduce distressing complications in the classical approach. 
The dynamic programming approach, however, is completely indifferent to analytic 
aberrations. 
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To apply dynamic programming, the problem must first be cast in the form of a 
multistage decision process, which means that we must use the discrete versions of 
(267) and (268); viz. , 


^x 2 + uj d t 

(270) 

n 

Z > dt 

(271) 

k=l 

N dt 

(272) 


This respresentation is valid as long as the increment, dt, is sufficiently small. 
Successive states are then related by 



(273) 


which is the transformation law, (262), for this problem. 
Now define 


W^(c) = the value of J, Eq. (271), obtained by starting in state c, having 
N stages to go, and using an optimal policy. 


A direct application of the principle of optimality yields the recurrence relation 


W N (c) = Min 
u 



(274) 


This simply means that in state c, with N stages to go, if we choose a particular 
(admissible) u, then we incur a cost |c - u 3 |a for this first stage and end in state 
[c + (c 2 + u)A]. Assuming that we proceed optimally thereafter, the remaining cost 
is, by definition, [c + (c 2 + u)A] for the remaining (N-l) stages. However, we 

choose not an arbitrary u for the first stage, but the one that minimizes this total cost 
Hence the form of Eq. (274). 


It is easy to see that when there is only one stage to go, we have 

W^c) = Min |c-u 3 |a 
u 

and this represents the "initial condition" for the recurrence relation (274). 


(275) 


72 



We now examine the computational sequence in some detail. To do this, we must 
decide 'on the range of interest for the state variables and the number of values of u 
in the range - 2 < u < 2. For present purposes, we will consider a painfully simplified 
version wherein 


c = -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5 
u = -2, -1, 0, 1, 2; A = 0.1 


Then via (275) , we calculate the table 


c 


Wj(c) 


-5 


0.3 


-4 0.3 

-3 0.2 

-2 0.1 


-1 

0 


0 

0 


1 


0 


2 

3 

4 


0.1 


0.2 

0.3 


5 


0.3 


u 

-2 

-1 

-1 


-1 


-1 

0 

1 

1 

1 

1 

2 


Normally, in realistic problems, this operation is performed on a computer, and 
the table is printed out. Now we calculate W 2 (c) from 


W (c) = Min 
u 


c - u A + 


W 1 [° + (° 2 + U ) A ] 


for a range of values of c. For example, with c = 3, we calculate the bracketed 
quantity for each of the five permissible values of u; we then select the one that is a 
mini mum as the value of W 2 (3). This yields, for example, 

W (3) =0.50; u = 1 
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Proceeding iteratively, we obtain, ultimately, the tabular form 


c 


W N< C) 


which represents, along with the tabular forms for Wj(c), j = 1,2, • • • • , N-l, the 
solution to the problem. Note that we have the solution, not only for one specific 
initial condition, c, but a whole range of possible initial conditions. This is charac- 
teristic of the dynamic programming method. 


We have thus far considered only "running costs" of the type exemplified by Eq. 
(263). It may be desired to minimize (or maximize) a function of the final state 

J = g[x (t f )j (276) 

In this case, we define 

W N (c) = the value of g |x (t^)j starting in state c, having N stages to go, and 
using an optimal policy. 

A simple argument shows that the recurrence relations are 


W N (C) 

= Min W [T (c, u)l 

(277) 


u 1 J 


W Q (c) 

= g (e) 

(278) 


In other words, with zero stages to go, the optimal W Q (c) is merely the value 
g [x(tf)], where c replaces x(tf). 

For the functional equations thus far considered, it was possible to write explicit 
initial conditions; that is, the number of stages in the process was known beforehand. 
This is not always the case. To examine this situation, consider the problem of driving 
the solution of 
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*2 = h (x v x 2 , 

»1<P) = C- L 
X 2<°> = C 2 

to the equilibrium state, = x 2 = 0, in minimum time, where the control, u^, is 
constrained by 

a < u 1 < b 

We consider the discrete version 


x 1 (t + A) = x x (t) +x 2 (t)A 
x 2 (t +A) = x 2 (t) +h| X]L (t), x 2 (t), u 1 (t) j A 
A = dt 


and instead of requiring that x n and x„ be simultaneously zero, it is sufficient to re- 

Q 2 1 M 

quire that (x-^ + x 2 ) <e , where e is some small positive number. 

If we define 

W (c-p c 2 ) = the time required to reach the equilibrium region (x^ + x 2 2 )< €, 
starting in state (CpC 2 ), and using an optimal policy. 


then a simple application of the principle of optimality yields the functional equation 


W (c . , c 9 ) = Min 
1 & u^ 


A + W [ c i +c 2 A. C 2 +h ( C l’ C 2‘ U l) A 


(279) 


The computational procedure for this equation is not immediately apparent, since 
we are unable to set up "initial" conditions. This equation is of the implicit type; that 
is, the unknown function appears on both sides of the equation. It is not a recurrence 
relation of the type previously considered. 


We may, nevertheless, obtain a solution via the concept of approximation in policy 
space, a powerful technique first discovered by Bellman.' j S crudely analogous 

to and predates the gradient technique discussed in Sec. 3.1.3. One proceeds as fol- 
lows. Select a policy, u^(t) , that brings the system to the equilibrium state, but 
that in general, does not result in minimum time. Let this time duration be denoted 
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by W^°fycpC 2 )f which is a first approximation to the optimal return function, W(CpC 2 ). 
Do this for a range of values of c^ and Cg. Calculate an improved return function, 
W^ 1 )(c 1 ,c 2 ), via 

w<1> (v c s)" Sf 

also for a range of values of c-. and c ot storing the improved policy u.,* 1 ). It can be 
shown* 16 ) that 1 



and the iteration stops when 



T) = small positive constant. 


The number and type of optimization problems that can be solved by the simple 
methods outlined above is truly astonishing. References 16, 17, and 159 contain 
numerous examples, together with an extensive bibliography. We now consider some 
simple, though fairly realistic , aerospace applications. 


A + W 


( 0 ) 


°1 +<! 2 A 


c 2 +h 


(°r V u i) A ] 


Example 6 : We return to the sounding rocket problem already analyzed via the vari- 
ational calculus (Example 1) and the maximum principle (Example 2 ) . In Example 1 
it was found that the optimal thrust control program may contain an ambiguityt for 
certain forms of the drag function. (See Fig. 3.) This ambiguity is completely re- 
solved in the dynamic programming solution. 


We now proceed to express the problem in the format of dynamic programming, 
using the notation defined in Example 1. The state transition equations are written 
in discrete form as follows . 


d V 


0M -D (V. h) 

m 


g (h) 


(280) 


d h = V 


(281) 


A = dt 


tThis ambiguity also exists via the maximum principle. 
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We have to take account of the fact that the control function satisfies the relation 


/3 = - m (282) 

and the constraint 

O<0<0 M (283) 


where 0 m is given. Let us suppose that admissible values of /3 may be taken from the 
q distinct values 


P 



(284) 


where 


P 

P 


0 

q 


= o 

= /3jyj, the permissible limit 


and that 




y, a prescribed positive constant 


Now a particular 0^ corresponds to a particular mass increment (d m)^ determined 
from 


P 


k 


(dm) k 

A 


(285) 


We now select A, y, and q such that 

P k = "k (286) 

where k is a positive integer (or zero) that is interpreted as a number of mass (i.e. , 
fuel) increments. If, at time zero, the total mass is 

m (0) = m Q + m p (0) 

where 


m^ = fixed mass (structure and payload) 

m„ = fuel mass 
F 
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then we find the number of "stages" in the process as 
m p (0) = N 

In other words, the size of the mass increments having been established, a parti- 
cular fuel mass can be expressed as a specified number of these mass increments. In 
particular , we denote the number of mass increments available at the start of the pro- 
cess by N. 

The problem is to determine a thrust control policy that maximizes the final altitude, 
given a prescribed fuel mass. Define 

W N (V,h) = the altitude attained starting with velocity V, at altitude h, having 
available N fuel increments, and using an optimal policy. 

The functional equation is 


w N (V,h) 


Max 


0 , 



Q 

i 

a. 

i 


VA + \V , 

v + — - g I A, h + VA 


N-k 

\ m 0 + N 7 






(287) 


The reasoning is as follows. Starting in state (V,h), we acquire an altitude in- 
crement, Va, and reach a new state 


/ V- p 

\ m 0 +N 



A 


h + VA 


Proceeding optimally thereafter, the altitude acquired is 


W 


N-k 



by definition. Note that there now remain (N-k) stages, since the choice /} , means 
that k fuel increments were expended. But we may choose /S^ to maximize This return. 
Hence the form of Eq. (287). The initial condition, Wq (V,h), is easily obtained, since 
with zero stages to go, the altitude acquired is simply the solution of the system 


V = 


D (V, h) 


m_ 


“ g (h) 


0 
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h = V 


where the initial conditions on V and h are merely the arguments of W 0 (V,h), and the 
integration terminates when V = 0. Denoting the altitude acquired during this coast by 
hj , we see that 

W Q (V, h) = h T (V, h) 

the notation h T (V, h) emphasizing that the altitude acquired during coast is a function 
of the starting values of V and h. If the altitude is sufficiently high such that the drag 
is negligible, then 


W 0 ^ h > = 2 ? 


The computational procedure is now straightforward. Some care must be exer- 
cised, however, in the choice of increments. One must choose A sdt small enough, 
for example, such that (280) and (281) represent good approximations to the continuous 
case. If we consider the numerical values 


m (0) = 1000 slugs 

m (0) = 900 slugs 

F 

H = 2000 ft/sec 

/S M = 25 slugs/sec 


then we may take 


A - dt = 0.1 sec 

£ = jo, 0.1, 0.2, , 25 

mass increment =0.1 slug 


N = 9000 


The solution time for this problem would be on the order of a few minutes on a 
computer of the 7090 type. 

Example 7 : One of the earliest studies dealing with the application of dynamic pro- 
gramming to trajectory optimization is the following. 
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Let the motion of an aircraft in the vertical plane be described by (see Fig. 
mV = T - D (V, h) - mg sin 0 
h = V sin 8 

where 

V = velocity 
D = drag 

T = thrust (constant) 
m = mass (constant) 
g = gravity acceleration 
h = altitude 

0 = angle of thrust vector with respect to horizontal reference 

We pose the problem of programming the thrust vector angle, 8, such that the 
aircraft attains a prescribed velocity and altitude, starting from some given initial 
velocity and altitude , in minimum time . 


9 ) 

( 288 ) 

(289) 



Figure 9. Coordinate System for Example 7 
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Taking the discrete version of (288) and (289), 


dv = [ T-°(V.b) - esln9 ] A 


d h = VA sin 0 
A = dt 


(290) 

(291) 


and defining 


W (V, h) = the time required to reach the prescribed terminal state (V^.hrp) 
starting in state (V,h) and using an optimal policy 


we obtain, via the principle of optimality. 


W (V, h) = A + Min 

0 


V + ■ — ] — - g sin 0 j A, h + VA sin 0 


(292) 


This is an implicit type of functional equation that must be solved iteratively by 
the method of approximation in policy space . 


Viewing the problem in a different light, Cartaino and Dreyfus^ 1 ® 0 ) obtained a re- 
currence relation for W (V,h) for which initial conditions could be prescribed, thereby 
circumventing the need to solve an equation of the explicit type. Their reasoning is 
based on the observation that the given problem is equivalent to that of determining the 
minimum time path in the h-V plane. From Eqs. (290) and (291), we see that 


A = 


tlMh 

(£K t - °) 


(293) 


w = mg 


Imagine now that the h-V plane is subdivided into grids with increments d h and dV. 
We suppose that the mass point representing the airplane moves in discrete steps from 
one node to the next, either straight up (V = constant) or horizontally to the right (h = 
constant). At each node point, a decision must be made whether to proceed vertically 
up or horizontally to the right. The criterion, of course, is that the grid from the 
current (initial) V and h to the prescribed V T and h^, be traversed in minimum time. 
The "costs" associated with the respective alternatives are 
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for constant V 


w dh 
V (T - D) 


wdV 
g (T - D) 


for constant h 


(294) 


via (293). 

Let us note at the outset that the insensitive pragmatist might be tempted to 
enumerate all possible paths and then select the minimum time path by direct com- 
parison. In the case of a p by q grid, the total number of possible paths under the 
conditions stipulated is 


(P + Q)! 

(p - 1) ! (q + 1) ! 


For a grid with p = q = 100, the number of paths to be enumerated is on the order 
of 10 59 . Since this closely approaches the estimated number of atoms in the galaxy, 
one is compelled to seek another approach. 


As a matter of fact, via the principle of optimality, with W (V,h) defined as before, 
we find 


W (V, h) = Min 


. w d h 
' V (T - D) + W 


V, h + dhj 


B. - + w[v +dV, h 

g (T - D) 


(295) 


This is arrived at as follows. If we are at the node in the grid whose coordinates 
are V and h, we have the option of moving vertically upward (constant V), in which 
case we incur the cost 


w dh 
V (T - D) 


(i.e. , this is the time consumed in going from h to h + dh) and arrive at the new grid 
point, (V, h +dh). Proceeding optimally thereafter, the cost is W (V, h +dh). This 
is alternative A. Alternative B is arrived at by similar reasoning. Finally W (V,h) 
is taken as the smaller of the two alternatives A and B. Since the terminal values of 
velocity and altitude, Vrp and h^, are specified, we see that 


W ( V T- dV -S)° g |T- W D d (V T .h T) j 


(296) 
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ALTITUDE (thousands of feet) 



Figure 11. Altitude/HorizontaJ Distance Profile of Minimum Time Path 
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w (v T , h T -dh) 


(297) 


wdh 

v t[ t ' d ( v t' h r)| 

These represent the initial conditions for the recurrence relation (295). 

It is easy to show that the number of paths to be enumerated in this approach is 
on the order of pq, compared with the astronomical number of paths to be enumerated 
by the brute-force approach. 

Figs. 10 and 11, abstracted from Ref. 160, show some typical results. From 
Fig. 10, Eq. (289), and the relation 

x = V cos 8 

x = horizontal distance 

we may plot the optimal path in the h-x plane, which is depicted in Fig. 11. The con- 
stant Mach number and constant altitude subarcs for the optimal path are well delineated 
in Fig. 10. The unique feature of the present approach is that the complexity of the drag 
function presents no difficulty whatever, whereas in the classical approach, one may 
encounter all sorts of analytical obstacles. 

3.1.5 Unifying Principles 

In the methods presented thus far, only the main results have been stated, together 
with various examples illustrating their application. The development of detailed deri- 
vations in each case would require a document the size of a textbook rather than a 
monograph. Nevertheless, the omission of derivations constitutes a pedagogic defi- 
ciency that precludes complete comprehension. Furthermore, a superficial examination 
would seem to indicate that the four optimization techniques considered thus far are 
basically unrelated. It may, in fact, be shown that not only are these techniques inti- 
mately related, but that they all fall out as special cases from a more general point of 
view — namely, they are all necessary consequences of the principle of optimality. 

We proceed to demonstrate this in the following way. 

Let the system dynamics be given byt 


X = f (x, u) 

(298) 

X (tj) = c 

(299) 


tAn explicit dependence of f on t may be removed by defining an additional state 
variable as noted earlier. 
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where, as before, x and u are the state and control vectors respectively. It is re- 
quired to choose u(t) such that a function of the terminal state 

J = j|x (tf)J (300) 

is minimized. t The control vector must satisfy the inequality constraints 

V. < u.. < ^ (301) 

j = 1, 2, • • • • , m 


and a stopping condition is given by 

n[x (t f )] = 0 (302) 

There also exist terminal constraints of the form 

>l> a [x (tf)] = 0 (303) 

a = 1,2, , q (<n) 


As noted earlier, this is a very general format, since a variety of performance 
functions may be transformed to type (300) , while state variable inequality constraints 
may be transformed to terminal constraints by defining additional state variables. 

(See Sec. 3. 1.3.) 


We now express (298) in the discrete form 

dx = f (x, u)A (304) 

x (A) = c + f (x, u) A (305) 

A ; dt 
and define 

W(x) = the value of J, Eq. (300), when the terminal conditions, (302) and 
(303), are satisfied, starting in state x and using an optimal policy. 

Applying the principle of optimality, we obtain the functional equation 


W(x) = Min 
u 


w|x + f (x, u)Aj 


(306) 


tThe arguments that follow are identical if "minimized" is replaced by "maximized." 


85 


Via a Taylor expansion. 


w[x + f (x, u) a] = W (x) + f j < x ’ U)A + 0 (a2 > 


Substituting in Eq. (306) and taking the limit as A-* 0, we obtain 


0 = Min 
u 


a w(x) 
ax 


f (x, u) 


(307) 


This is the partial differential equation that must be satisfied by the optimal return 
function. 


Let us now define 


X. 

J 


aw 

ax. 

] 


(308) 


or , in vector form , 


X 


aw 

ax 


s -v x w 


(309) 


and 


T 

H = X f 


(310) 


The notation anticipates the interpretation 
and hamiltonian respectively. Then from Eq. 


of X and H as the Lagrange multiplier 
(307), 



Min (-H) 
u 


and finally, 

0 = Max H 
u 


(311) 


This states that the control, u, that minimizes J must necessarily maximize H; 
in other words, Eq. (311) is a concise statement of the maximum principle. 
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We may express equation (307) by the two equivalent equations 


aw(x) l 
dx 


J 


f (x, u) = 0 


( 312 ) 



This last equation may be written as 


faw(x)r 

faf (x, u)] 

l J 

[ J 


(313) 


(314) 


if there are no constraints on u. In the development that follows, it will be assumed 
that such a constraint has been removed by defining an additional state variable, to- 
gether with an added differential constraint in the manner discussed in Sec . 3.1.1. 


d raw] = a 2 w dx k 

dt L Sx j J fei dx j 3x k * dt 


n 


5 * iU> 


a pw\ 

ax. \ax k ; 


(315) 


A partial differentiation of Eq. (312) with respect to Xj yields 



0 


Combining this with (315), 



h au e 



0 


(316) 


(317) 


But Eq. (313) can be written as 



or 


^ 3W^k 

(k d \ 8u j 


= 0 


i = 1, 2, • • • • , m 


Substituting this in (316), we find 


iffl.V 5» . 0 

dt ' a y pi ax k 8 *j 


i = i, 2, 


n 


which reduces to 


A + 
J 


E x 

k=l 



k ax. 

J 


j = 1, 2, , n 


(318) 


(319) 


(320) 


by combining with (308). It is easy to show that this is an alternate form of the Euler 
Lagrange equations. 

For this purpose, consider the augmented function defined by Eq. (13); viz. , 


F 



We obtain directly 


3 F 

ax. 

j 


a. 

j 


a f 

Sx. 

J 


n at 

V A, 

Pi k8x i 


j - 1, 2, 


• • • • 


n 


Consequently, the Euler Lagrange equations 


_d_/aF\ _ a F 
dt\ax.j ax. 

r j 


= 0 


j = 1. 2, 
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may be expressed as 



j = 1, 2, , n 


which is completely identical with (320). 

This leads us to interpret Xj as the rate of change of the optimal return function 
with respect to the state variable, Xj. 

We have already shown (in Sec. 3.1.3) that the Euler Lagrange operator 
7i( v *-' v x) 

may be interpreted as a generalized gradient. Furthermore, condition (311) is an 
alternate expression for Eq. (183), wherein (SH/du) is another type of gradient (the 
Green's functions). 

We have shown, therefore, that the basic results embodied in the variational 
calculus, the gradient technique, and the maximum principle are all necessary con- 
sequences of the principle of optimality in dynamic programming. 

3.2 STANDARD SOLUTIONS 

In certain well defined optimization problems, it is possible to obtain closed-form 
solutions. If the format of these problems is sufficiently general to include as special 
cases a wide variety of situations of practical interest, it is tempting to refer to these 
as standard solutions. These results are useful not only because they solve specific 
problems, but also because they may be taken as first approximations to more complex 
problems. This theme will be further developed in Sec. 3.3. 

3.2.1 Linear Problems 

The most tractable optimization problem, from an analytic point of view, is one 
whose dynamics are governed by a set of linear differential equations with a performance 
index of the quadratic type. This is expressed mathematically by 


Ax + Bu 

(321) 

Gx 

(322) 

= c 

(323) 
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where 


x = n-dimensional state vector 
u s m-dimensional control vector 
y = q-dimensional measurement vector 
c s initial condition vector 
J = performance index (cost); a scalar 
A = n X n constant matrix 
B s n X m constant matrix 
G = q x n constant matrix 
Q = q x q constant matrix (symmetric) 

R 2 m x m constant matrix (symmetric) 

M = n x n constant matrix (symmetric) 

The problem is usually stated in the following form. Given the system, (321) and 
(322), with the initial condition (323), determine the control vector, u(t), such that the 
performance index, (324), is a minimum. t 


We define the optimal return function as follows. 

■ t. 


W (x, t) = Min 
u 


£ f (x T G T QGx +u T Ru^dt 


+ x (t{) Mx (t{) 


(325) 


Note that W(x,t) is a function of the current state and the current time. Applying 
the principle of optimality, we find that W(x,t) satisfies the recurrence relation 


W (x, t) = Min 
u 


(x T G T QGx+u^ Ruj A + W^x + dx, t + A^J 


( 326 ) 


As dt 


tThe arguments are completely analogous if "maximum" replaces "minimum." 



Expressing the second term in the brackets by its Taylor expansion, we have 


w/x + dx, t + a) = 


W (x, t) + 


[dW(x, t)l T 


^ nr 


= W (x, t) + 


aw (x, t) | r 

ax J 


(Ax + Bu)A + 


aw 

at 


retaining only first-order terms in a. Substituting this in (326), dividing through by 
A, and letting A— 0, we obtain 


aw 

at 


Min 

u 


/ T T _ T \ 

/aw\ T 

lx G QGx + u Ruj h 

( 17 ) ,Ax + Bu > 


From (325), we see that a boundary value is given by 


W 



T 


x 


(tf) Mx (t f ) 


(327) 


(328) 


Eq. (327) is the Hamilton-Jaeobi equation for the system which could also be de- 
rived from the maximum principle. Despite its formidable appearance, Eq. (327) has 
a closed-form solution that may be readily obtained as follows. From (327), we write 
the two equations. 


x T G T QGx +u T Ru + (-|^) (Ax + Bu) = - 

aw 
’ a it" 

(329) 

±f(x T C T QGx + u T Ru) + (|^) (Ax + BU) 

= 0 

(330) 


Now (329) is satisfied only for that value of u obtained by solving (330). This u, 
when substituted in (329), yields an equation that is completely equivalent to (327). 
Performing the indicated operations in (330) , we find 



2Ru + B 


t aw 


ax 


o 


or finally. 


u 


* = 
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(331) 


We use the superscript ()* to indicate that this is the optimal control function 
(policy), which when substituted in the system (321) yields the optimal trajectory 
x*(t), and is the one which minimizes (or maximizes) the performance function (324). 


Substituting (331) in (329) yields 


aw 

at 


T T 
x G 


QGx + 



Ax - 



BR 


-1 



This equation is equivalent to (327). We assume a solution of the form 
W (x.t) = x T P(t) x 

where P(t) is a symmetric matrix, which is for the moment unknown. Then 


(332) 


(333) 


aw 

ax 


2 P(t) x 


aw 

at 


T • 

x P(t) x 


and Eq. (332) becomes 


-1„T. 


-x Px = x (G QG+2PA-PBR B P x 


(334) 


Since the matrix P is symmetric, it is convenient to have all the terms inside the 
parentheses symmetric; in fact, the only one not symmetric is 2 PA, since A in general 
is not symmetric . It is known, however, that any square matrix may be expressed as 
the sum of a symmetric and a skew-symmetric matrix. In the present case. 


2 PA 


= (pA + A T Pj+(lPA - A T pj 


where 


(PA + A T P 


(PA - A T P 


j = symmetric 
j = skew -symmetric 


Now let 


a = Ab 


where a and b are arbitrary vectors. Then 


T 

a 


T T 
b A 
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and 


b T |pA-A T pjb = b T PAb-b T A T Pb =b T Pa-a T Pb = 0 

In other words, (PA-A^P) is identically zero . Consequently, Eq. (334) becomes 

Tf • T -IT T 1 

x P+PA+A P - PBR B P +G QG x = 0 

This yields the nonhomogeneous matrix Riccati equation 

P + PA + A T P - PBR -1 B T P = -G T QG (335) 

From (328) and (333), we see that 

P(tf) = M (336) 

The solution of Eq. (335) may be obtained in the following way.^® 1 ) Define an 
associated system of linear matrix equations as follows. 


• -1 T 

Y = AY - BR B Z 


• T T 

Z = -G QG Y - A Z 


Then the solution of Eq. (335) is given by 


P = ZY 


-1 


(337) 

(338) 

(339) 


This is easily verified by differentiating the above expression and substituting for 
Y and Z from (337) and (338). Use is made of the relation 

•-1 -1 • -1 
Y = -Y YY 

which follows by differentiating the identity 
Y -1 = Y -1 Y Y _1 


with respect to time. 

Writing Eqs. (337) and (338) in the form 
V - FV 


(340) 
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where 


V s 


F = 


T ! T 

-A [ -G QG 

-1 T 

-BR B ' A 


we find 


Ft 

V (t) = e V(0) 


(341) 


If final rather than initial conditions are known, we write instead 


V(t) = e F(t_tf) V(t f ) 


which is equivalent to (341) if we note that 
e‘ Ftf V(t f ) = V(0) 

Consequently there is no loss of generality in dealing with V(0) as an "initial condition" 
matrix . 

Let Eq. (341) be partitioned as follows. 


Z(t) 


J 



'v 1 ( 0 )‘ 

Y(t)_ 


1 

V' 1 - 


. V 2 (0) . 


Then, using (339), we find 

P(t> = [<P u (t> V x (0) +<p i2 (t) V 2 (0)j|<p 21 (t) V J (0) +<p 22 (t) v 2 (0)] _1 

= |<P n (t) +<p i2 (t) V 0 (0)j |<p 21 (t) +<P 22 ( t) V 0 (0)J -1 (342) 

where V Q (0) replaces V 2 (0)V 1 _1 (0). The matrix V Q (0) is evaluated using the boundary 
condition (336). 
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Equation (342) is the solution of the matrix Riccati equation (335). 

Having P(t), we evaluate the optimal control vector, u*(t), from (331), making use 
of (333); viz. , 


u*(t) = -R 1 B T P(t) x(t) (343) 

An important special case arises when M = 0 and t^ — “ in Eq . (324). In this case , 
W and P do not depend on t . The matrix Riccati equation (335) then reduces to 

PA + A T P - PBR’ 1 B T P = -G T QG (344) 

which is merely an algebraic equation for the determination of P. For large n, the 
computational solution of (344) is not trivial . The recommended procedure is to 
integrate Eq. (342) with an assumed value for Vq(0), and stop when P(t) reaches a 
steady-state value, which is then taken as the desired value of P. The speed of con- 
vergence depends, of course, on the selected value of V Q (0). This, however, is not 
too critical, since convergence is generally very rapid. 

The optimal control vector is now given by 

u*(t) = -K x(t) (345) 

-1 T 

K = R B P, a constant matrix 

In other words, the optimal control is a linear function of the current state . Thus 
the optimal control is expressed as a closed -loop system rather than an open-loop type, 
which has characterized the results obtained in previous sections. 

3.2.2 Norm-Invariant Systems 

Consider the nonlinear system 

X = g (X.t) +u (346) 

where, as before, x is an n-dimensional state vector and u is an n-dimensional control 
vector. The latter satisfies a constraint of the formt 

INI < y (347) 

We pose the problem of determining the control vector, u(t), subject to the above 
constraint, which drives the system from an arbitrary initial state to the "origin," 
x=0 , in minimum time . 


tSee App>endix C for a discussion of norms . 
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We suppose that the system is norm-invariant ; that is, the homogeneous form of 
(346) 


x = g (x, t) (348) 

has the property 

||x(t) || = ||x(0) |f for all t > 0 (349) 

In this case, Eq, (C15) shows that 


T T 



which leads to 

X T g (x,t) = 0 (350) 

The results that follow were first obtained by Athans . His derivations, how- 
ever, relied heavily on function space methods and were not in the "mainstream" of 
optimal control theory. 

A simpler and more direct development is possible by application of the basic 
concepts of dynamic programming as follows. 

Define 


W(x) = the time required to transfer the system from state x to the origin, 
x=0, using an optimal policy. 

Applying the principle of optimality, we obtain 

W(x) = Min 
u 

In words, if we are in state x, then by applying a control, we incur a "cost," 
A=dt, and arrive in the new state, Cx + (g + u)A]. Proceeding optimally thereafter, 
the cost is W [x + (g + u)A] by definition. However, we select u to minimize this 
total cost. Hence the form of Eq. (351). 

Via a Taylor expansion. 


A + W 


|x + (g + u) A J 


(351) 


W 




W (x) + 


pW(x) J T 


(g + u) A 


+ 0 (A 2 ) 
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Substituting this in (351), dividing through by A, and letting A-*0, we find 



( 352 ) 


Minimizing the expression in the brackets is equivalent to minimizing 

(Hr) 1 " < 353 > 

where u is constrained by (347). Since expression (353) is merely the inner product of 
two vectors, one of which is arbitrary and the other of which is constrained in magni- 
tude, it is easy to see that (353) is minimized when 



Equation (352) is therefore equivalent to 



The solution of this equation is 

W(x) = -^-(x T x) = y ||x|| 
as may be verified by direct substitution; viz. , 


1 



which , when substituted in (355) , yields 

_ JL i 1 

0 ~ 1 X J x g_ 7\ x x ; v x x ) y 

The middle term vanishes by virtue of (350). Q.E.D. 


(354) 


(355) 


(356) 
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Using (356), the optimal control vector, u(t), may be written as 


u*(t) = -A Y 


Thus u*(t) is obtained as a function of the current state (feedback principle), and 
the minimum time is obtained from Eq. (356), with x*(t) representing the optimal tra- 
jectory; i.e. , the integration of Eq. (346) using the optimal control, (357). 

A similar analysis shows that the control that minimizes 


A w 


k = positive constant 


while driving the system from an arbitrary state, x, to x = 0, is also given by (357). 
The terminal time, tj, is not fixed in advance. This may be interpreted as a minimum- 
fuel problem. 

It may also be shown, via the same techniques, that the control that minimizes 


f f 

/ (u T u)dt 


while transferring the system from an arbitrary initial state, c, to x = 0, is given by 



x(0) = c 

t^ = prescribed 

The optimal trajectory, x*(t), is the solution of Eq. (346), with u*(t) given by (360). 
Note that now the optimal control depends explicitly on the initial state and the prescribed 
time, tj, whereas previously the optimal control was a function of the current state of 
the system only. This last case, the minimization of the integral, (359), is a type of 
minimum-energy problem . 

The above results are quite significant, since closed-form solutions for nonlinear 
optimization problems are hard to come by. More important, several cases of practical 
interest in aerospace control are indeed characterized by norm-invariant properties. 
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The latter is often a consequence of conservation of momentum. In particular , two 
classes of norm-invariant systems are the following. 


if 


1. All linear systems of the form 

x(t) = A(t) x(t) + u(t) (361) 

l|u(t)|| < y 

A(t) = -A T (t) (362) 

2 . All nonlinear systems of the form 

x(t) = B|x(t), tjx(t) +u(t) (363) 

l|u(t)|| < y 

B[x(t), tj = -B T [x(t), tj (364) 


Example 8 : For purposes of analyzing the short-period motion, a satellite may be as- 
sumed suspended in a force -free field. Let Ij, Ig, and Ig denote the three moments of 
inertia of the body about the principal axes that pass through the center of mass. Also, 
denote by Xj_, x 2 , and Xg the three angular velocities about the axes, 1, 2, and 3 re- 
spectively. The motion is then described by the Euler equations 


'l *1 = ( 

1 - I ) x x„ + T, 

2 3/ 2 3 1 

(365) 

'2 k 2 ’ ( : 

I 3- I l) X 3 X l* T 2 

(366) 

■3*3 -(’ 


(367) 


where Tj_, T 2 , and Tg are the components of the torque vector, T. 
If we define new variables by 


y i = I 1 X 1 


y 2 = * 2*2 


y 3 = h X 3 


(368) 
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then the Euler equations take the form 


(l 3 “ I 2 ) y 2 y 3 +T 1 
y 2 " (l* " r) y 3 y i + T 2 

X o 

y = /— - — \y y + T_ 
y 3 \I 2 ^7 12 3 


(369) 

(370) 

(371) 


Quantities yp y 2 , and y 3 are the components of the angular momentum vector, y. It 
is easy to show that when = T 2 = Tg = 0, 


&"*»»» * 0 


(372) 


In fact, in the present case, this is merely a statement of the principle of conservation 
of angular momentum. Now if the constraint on the torque vector is of the form 

I! T(t)|J < y (373) 

the theory of Sec. 3.2.2 is directly applicable. Therefore, the control law 


T* = Zffi- y (374) 

II y(t) II y 

will bring the system from an arbitrary angular momentum to zero angular momentum 
in minimum time. This corresponds to stopping the tumbling motions of the body in 
the shortest possible time. 

Eq. (374) shows that the torque vector takes on its maximum absolute value and is 
directed opposite to the angular momentum vector. 


3.2.3 Optimal Trajectories 


A simple form of optimal launch trajectory problem may be formulated in the 
following way. It is assumed that the vehicle is a point mass in a uniform gravitational 
field; aerodynamic effects are neglected, and the flat earth approximation is used. Ihe 
equations of motion then take the form (see Fig. 12) 


A u cos u 

• _ 1 2 _ j, 

f 

1 x„ 1 


(375) 
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Figure 12. Coordinate System for Optimal Trajectory Problem 


A u sin u 
1 2 


*2 = 

12 

X 3 

(376) 

*3 = 

'“"I = f 3 

(377) 

*4 = 

X 1 ' f 4 

(378) 

i 5 = 

Here 

x 9 (t) 

X 2 * f 5 

= horizontal component of velocity 
= vertical component of velocity 

(379) 


x 4 (t) = range 

x_(t) s height 
5 
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x„(t) = nondimensional mass = — ~ 

3' ' m(0) 

T 

u^(t) = nondimensional thrust = — 

max 

u (t) s inclination of the thrust vector with respect to the horizontal 
The thrust is given by 

T = - m(i M ~ constant (380) 

and satisfies the constraint 

0 < T < T (381) 

max 


which means that 

0 < u 1 (t) < 1 

Constants A and a are defined by 
T 

max 

m(0) 

T 

_ max 
“ “ m(0)fx 

It is required to program the thrust magnitude and direction (uj and u 2 respectively) 
such that a suitable index of performance is optimized. In various guises this problem 
has been studied by many investigators. (18,67,70,71) g ven so simplified a problem as 
this does not yield a closed-form solution. However, certain general features of the 
optimal control functions can be obtained, and the complete solution requires that one 
solve a set of differential equations with prescribed initial and terminal boundary 
conditions. 

The development that follows is based on the work of Isaev^ 16 ^ and includes the 
results of many other investigators as special cases. 

We formulate the general problem as follows. Given the initial conditions 

x.(0) = a (382) 

j = 1, 2, •••*, 5 
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find the control (ui, u 2 ) that transfers the system described by Eqs. (375) through (379) 
from the given initial state to a prescribed final state such that the function 

5 

J =E c i X i <¥ < 383 > 

j=l J 3 


is a minimum (maximum), where the final time, tf, is given. 

As shown below, the problems of maximum range, altitude, final velocity, and 
minimum fuel are special cases of this general formulation. The prescribed final 
state and performance function will, of course, differ in each case. Nevertheless, 
certain general features of the control functions are common to all. 


We will seek to obtain a solution via the maximum principle. The components of 
the Lagrange multiplier (costate) vector are defined by Eq. (78); viz. , 



X„ = 


~r( x i 


cos u + X sin 


“ 2 ) 


0 


0 


(384) 

(385) 

(386) 

(387) 

(388) 

(389) 


The hamiltonian is then given by 

"■Ey, 

j=i 3 3 
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or. In expanded form 


^[4( X 1° 08U 2 + V 1,,u 2)-“ X 3] 


+ X 4 X 1 +X 5 X 2 (390) 

Minimizing (or maximizing) H with respect to Uj and u 2 is equivalent to minimizing 
(or maximizing) 




An elementary calculation shows that H is maximized by 


U 1 = 1 


> 0 


-1 \ 

U 2 = t3n ^ 


and minimized by 


4>j_ < 0 


U l = 1 


<J> > 0 
2 


- 1*2 


u = tan — 
2 X, 


*2 < ° 


= - tan 


•1 X 1 
X 2 ' 2 


1 2 n 
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(397) 



For definiteness , we will consider only the case of maximizing! the performance 
index, J, of Eq. (383). In this case, the maximum principle requires that we minimize 
H. Accordingly, the optimal control functions are given by Eqs. (394) and (395). 

Now Eqs. (388) and (389) yield immediately 


X 4 X 40 

(398) 

X 5 " X 50 

(399) 

and, in turn, 


X 1 = " X 40 t+X 10 

(400) 

X 2 = " X 50 t +X 20 

(401) 

where X 1Q , X 20 . X^ Q , and X 5Q are constants. 


It follows that the optimal u 2 (t) is given by 


X 20 " X 50 1 

U 2 (t) " 1 -X t 
2 X 10 A 40 1 

(402) 

We therefore have the results that the optimal thrust inclination is a bilinear func- 
tion of time and the thrust magnitude is of the bang-bang type. This is a very general 
feature of the optimal control, since no stipulations have as yet been imposed on the 
performance index, J, or the boundary conditions, x.(tj). 

If we define 


Sgct 2 = 1, <J>2 > 0 


= 0 , <J» 2 c 0 

(403) 


t Minimizing J is equivalent to maximizing (-J). 
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then after substituting (394) and (395) into the set of Eqs. (375) through (379), (385) 
through (389), we obtain 


A V 8 *2 

Xi 

(404) 

X 2 2 2\l/2 g 

X 3 ( X 1 + \) 

(405) 

x* = -aSgct 2 

(406) 

*4* ’ X 1 

(407) 

*5 = X 2 

(408) 

*1 ' - X 4 

(409) 

X 2 ' 

(410) 

A /. 2 . 2\l/2 „ 

A ( x i +x 2 ) 

X 3 2 

(411) 

X 3 


X 4 “ ° 

(412) 

X 5 = ° 

(413) 


The set of Eqs. (404) through (408) describes the optimal trajectory. It is easy to 
show that the switching function, $ 2 , can change sign no more than twice. From 
Eq. (397) 

* . v 
2 
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which is obtained by elementary differentiation and making use of (406) and (411). By 
virtue of (409), (410), (412), and (413), this may be written as 



where the f}^ are constants. Since x^(t) is a bounded monotonic function, it follows that 
<f >2 can vanish for only one value of t, say t Q . Therefore, by Rolle's theorem, <J> 2 can 
vanish at no more than two points . 

It follows that the optimal trajectory can have no more than two powered phases. 
We now consider several special cases. 

1. Maximum Horizontal Velocity 


Given: 


Xj (0) = (415) 

j = 1, 2, ••••, 5 

* 2 <V = 0 

x 3 (t f ) = b 3 t f given (416) 

x 5 Ctf> = b 5 

Maximize J = x^ (tf) 

Therefore 0 ^ = 1 


c 


2 


= c. = c„ 


0 


Using the methods of Sec. 3. 1.2 to determine the boundary conditions for the Xj, 
we find 


X 1 <V - - 1 - \o 

X 4 (t f> = ° " X 40 


(417) 
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The solution of the problem is completely determined by integrating the 10 dif- 
ferential equations (404) through (413) whose boundary conditions are given by (415) 
through (417), with <tVj defined by Eq. (397). 

Note that by virtue of (417), the control law, Eq. (402), reduces to 

U 2« “ < 418 > 


2. Maximum Altitude 


Given: 


V 0) ' a j 


(419) 


3 - 1, 2, • • • • , 5 


X 3 (t f ) = b 3 


x 4 <¥ = b 4 


t f given 


(420) 


Maximize J = x g (t^) 


Therefore c_ = 1 
o 


C 1 ' °2 " °3 ■ c 4 ■ ° 


and 


(t f ) = 0 
X 2 (tf) = 0 

X 5 (t f) 


= -1 


(421) 


In this case , the solution to the problem is obtained by integrating the set of 
equations (404) through (413), with the boundary conditions given by (419) through 
(421). 

We note also that via (421), the control law (402) becomes 

u*(t) = - -r— — = constant (422) 

2 A 40 
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3. Maximum Range 


Given: 


X j (0) = a J 


X 3 <V = b 3 
X 5 ( V = ° 

Maximize J = x 4 (*{) 



t { given 


= 0 


j = 1. 2, 


5 


(423) 


(424) 


*i Of) “ 0 * \o 

X 2 Of) - » ■ X 20 < 425 > 

X 4 Of) a "I = X 4 o 

Again the solution is obtained by integrating the set of equations (404) through 
(413), but with the boundary conditions (423) through (425). 

The conditions (425) now yield the control law 

u* (t) = — X = constant (426) 

2 Ou 


4. Minimum Fuel 
Given: 


x 


j 



X 4 <¥ = b 4 
X 5 <V = ° 
Maximize J = -x g (t^) 


t^ given 


j = 1, 2, 


5 


(427) 


(428) 
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-1 


x l <V = 0 * X 10 

<V " « ' * 20 < 429 » 

*3 <V = 1 ' *30 


As before, the problem is solved by integrating (404) through (413), but for this 
case using the boundary conditions (427) through (429). 


Using conditions (429), the optimal thrust inclination is again a constant; 
viz. , 


u *(t) 



= constant 


(430) 


Remark: It has been shown that the optimal control for this problem is characterized 
by the following properties. 

1. The thrust magnitude is either zero or maximum (bang-bang). 

2 . The thrust inclination is , in the general case , a bilinear function of 
time. For specialized cases, it may be a constant or a linear function 
of time . 

3 . The optimal trajectory contains no more than two powered phases . 

In order to obtain the complete solution (exact form of the optimal tra- 
jectory), it is necessary to solve a set of nonlinear differential equations 
with two-point boundary conditions. This is not a trivial computational 
task. Some promising techniques for doing this efficiently have been devel- 
oped in recent years; a description of these is contained in Appendix A. 

It should also be noted that in the above analysis, there was the implicit 
assumption that a solution exists. While this may be true in most cases of 
practical interest, it must be recognized that not all combinations of con- 
ditions give a meaningful or well-defined problem. Generally, however, if 
there exists a control that satisfies the given boundary conditions, one may 
proceed to calculate an optimal control with the assurance that this is 
physically and mathematically meaningful. 
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There is one further mathematical feature of the solution: the control 
laws derived represent only necessary conditions for the optimum. They 
may not be sufficient. Sufficiency proofs are at present available only for 
linear systems . Again, in most cases of practical interest, this is a 
mathematical sophistry that does not seriously compromise the solution. 

3.3 AEROSPACE APPLICATIONS 


The literature on application of optimal control theory to guidance and control of 
aerospace vehicles is very extensive. Even the large list of references in this mono- 
graph is far from complete. One of the best survey papers on this subject is the one 
by Paiewonsky, which reviews the highlights and historical development of the 
theory and gives a critical evaluation of basic contributions. Other basic sources 
are the books by Lawden , (2) Chang, W Leitmann, ( 3 ), and Merriam^ 146 ^ and the 
papers by Miehle/ 18 ^ Lawden, ( 70 ) and Ho.( 135 ) 

The choice of examples to illustrate "typical" applications is largely subjective. 
Among the factors that may guide such a selection are theoretical novelty, practical 
importance, or mathematical elegance. Since the primary aim for the moment is 
pedagogic exposition, the examples that follow were selected for their didactic value 
and practical interest. The extensive detailed treatment that characterizes practical 
design problems of high order was avoided, since the details tend to obscure the 
essence. 

3.3.1 Lunar Soft Landing^ 32 ) 

A problem of some practical interest is the following. Assume that a lunar ve- 
hicle has arrived in the vicinity of the moon's surface. The motion of the vehicle is 
vertically downward, and the only forces acting are the thrust and lunar gravity (as- 
sumed constant). The thrust magnitude may be throttled between zero and some pre- 
scribed maximum. It is desired to calculate the thrust program that will result in a 
soft touchdown (altitude and altitude rate simultaneously zero) while minimizing the 
fuel expenditure. 


With these assumptions, the motion is governed by 

•• c m 
h = - — “ S 

m 

where 

h s altitude of vehicle above lunar surface 
m ■ instantaneous mass 
g = lunar gravity acceleration 


(431) 
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The thrust is given by -c m, where c is a positive constant and m ? 0. We also 
have the boundary values 


h <°> = x l0 
h (0) = x 2Q 

h (t f ) = 0 
h(t f ) = 0 


(432) 


(433) 


and the thrust magnitude constraint 

- a < m (t) < 0 (434) 

where a is a positive constant. 

Since the vehicle is assumed to be in the terminal descent phase, physical con- 
siderations indicate that x^q >0 and X2Q < 0. 

Minimizing the fuel expenditure is equivalent to minimizing the performance 
index 


J = m (0) - m (t f ) (435) 

This completes the mathematical statement of the problem. Before proceeding 
with a formal solution, certain simplifications may be effected in the following way. 
We have 


m d 

m “ dt (< " m) 

where n( ) denotes natural log of. 


Therefore Eq. (431) may be written as 
d 


h = 


c dt (^nm)-g 


Integrating between the limits of 0 and t^, 


• rn (t\ • 

h (t) = - c Xn - gt + h (0) 


(436) 
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Now h(tj) = 0 if, and only if 


m (t f ) 

cXn ^T = h <°>- gt f 

Solving for m (t^) , 

rh (0) - g t f ' 
m (t f ) = m (0) exp 

Substituting this in (435) , 


& 


(437) 


J = m (0) 


1 - exp 


h (0) - g t f 


It is apparent, therefore, that minimizing the fuel expenditure is completely 
equivalent to minimizing the total time, t^. In other words, we reduce to a minimum- 
time problem. 


We introduce the definitions 



u = m 

Eq. (431) is then represented by the system 



with the control constraint given by 
-a < u(t) < 0 


(438) 

(439) 

(440) 


(441) 
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and boundary conditions 


X 1 (0) = x 10 

X x (t f ) = 0 


x 2 = x 20 

X 

to 

**■•>» 

ii 

o 

(442) 

X 3 (0) = X 30 

x 3 (t £ ) S free 



In accordance with the discussion of Sec. 3.2.1, the minimum time problem 



is converted to a problem of minimizing x 4 (tf) where 



x 4 = 1 (443) 

x 4 (0) = 0 (444) 


Proceeding via the maximum principle, we find, for the hamiltonian, 


H = X 1 X 2 


A 2(^- S ) +X 3 U+X 4 


(445) 


where the Lagrange multipliers satisfy 


X 2 " ’"I 


X = - 


X 2 eu 


x = o 


(446) 

(447) 


(448) 


(449) 
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In order to minimize x 4 (t f ), we must maximize H. 
fore given by t 


The optimal control is there- 


u*(t) = -a 

= 0 


when cp < 0 
when <J> > 0 


(450) 


<t> = 



(451) 


hi Ref. 52 , a rather elaborate analysis is used to show that <I> cannot change sign 
more than once in the interval 0< t < tj. It is possible, in fact, to derive this result 
very easily as follows. 

Differentiating (451) and using (440), (447), and (448), we find 


<t> = 



(452) 


But Eq. (446) shows that = constant, while Xg = m(t) is a positive monotonic 
function of t. Therefore <j> cannot change sign in the interval, 0 ^ t <tj, which, in turn, 
means that <J> can change sign no more than once in this interval . Consequently, once 
the full thrust is switched on, it remains on until touchdown. We now seek to deter- 
mine the relation between x-^t) and Xgft), at which switching occurs. Let the interval 
over the thrusting phase be denoted by (0,t^,). Denote by x^ Q , x^q, and Xg 0 the altitude, 
altitude rate, and mass at the instant of switching to full thrust. Then from Eq. (439), 


x 2 (t) 


- c In 





gt+x 


20 


0 < t ? t m 
T 


Using Eq. (438), we obtain 


(t) 



(r) dr + x 


10 



(453) 


+ ct-|gt 2 +x' 0 t + x' 0 (454) 

0 < t < t 

T 

+A somewhat lengthy analysis shows that the singularity condition <t>=0 is incompatible 
with the physical constraints of the problem. See Ref. 52. 
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For a soft landing, we must have 


^2 X 2 (^»p) 0 


Therefore Eqs. (453) and (454) become 
0 = - c In 


g t + x 
6 T 20 


'4 * t ) 

“ V s 30 T> ' Ho T> r 2 


(455) 


+ x ' t + x / 
20 T X 10 


2 

T 


(456) 


These may be solved for x^q and x^q as follows. 


x„„ = c In 
20 


1 Ho T - 


+ g t n 


/ cx 30 / a \ l 

10 = "IT “l 1 "^ Tj- Ct T-2 gt ' 


2 

T 


(457) 


(458) 


K we eliminate t^ between these two equations we obtain a relation of the form 
F(x(o, x/ 0 ) = 0, which determines the altitude and altitude rate at which one switches 
to full thrust. A fairly simple form for F is obtained by using the approximation 


In 


/ > a t_ , /a t \ 

L \ _ T _ 1 T\ 

\ x' *F/ ~ x' 2 \ x' / 

' 30 ' 30 \ 30/ 


which for 


m (t T ) 
**0 


> 0.75 


(459) 


is accurate to within 2 percent. 

In this case, Eqs. (457) and (458) become 

/ , 2 
X 10 = at T 


(460) 


X 20 = "2at T -bt T 2 
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(461) 



where 


•fl^l 




Since the ratio of maximum thrust to initial mass must be greater than g, we see 
that a > 0. Also, since x^g is positive, it is apparent that the only value of t-p that is 
physically meaningful is 




(462) 


Substituting this into (460) yields 


F ( x w^o) =!*Wo + 2 


10 


(463) 


Physical considerations dictate that x 10 > 0 andxjg < 0, since altitude is positive 
and the vehicle velocity is in the negative x^ direction. Furthermore, the inequality 
(459) , together with the relation 


x 3 (t) = m (t) = x' Q -a t 


leads to 


0 < t < t 


30 

0 < t_ < — — 
T 4a 


1 30 

Using t T I = , we find that the range of values of interest in the xp - x 2 

plane is given 


(464) 


Xork 2 


o 


/ 3 

16 U ) 


_ a/ X 30\ Jb/^30\ 

2\a / 16 V a / 


< x„ < 0 
2 


(465) 


(466) 


A plot of the switching function, (463), for the range of values given by (465) and 
(466) is shown in Fig. 13. 
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Figure 13. Plot of Switching Function and Free-Fall Trajectory 

Assume now that the vehicle is at some initial altitude, x^q , and with an initial 
altitude rate, x 2Q . The free-fall trajectory is readily obtained from 

x i * X 10- 2-i[ X 2 2 - X 20 2 ] < 467 

This is also plotted in Fig. 13. 

The form of the optimal trajectory is now apparent. If the initial conditions are 
such that F(x 10 ,X 2 q) > 0 — i.e. , the point (Xjq,x 2 q) lies above the switching curve as 
shown in Fig. 13 — the vehicle is allowed to continue in free fall until F(x 1 ,x 2 ) = 0. 

At this point, full thrust is switched on and remains on until touchdown. 

3.3.2 Optimal Control of Booster Vehicles 

The fact that a complete solution is available for the problem of optimal control 
of a linear system with a quadratic performance index (Sec. 3.2.1) has motivated the 
effort to relate pragmatic design requirements to this format. As a matter of fact, 
a far from trivial problem is that of formulating a meaningful optimality criterion for 
a launch vehicle autopilot. One usually requires that the control system stabilize the 
unstable airframe. In addition, it is desired that the deviations from desired attitude 
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be small, and that the angle of attack be small in order to reduce bending loads, engine 
angle deflections, etc. With the exception of stability, none of these automatically re- 
sults from the quadratic performance criterion. 

Several recent studies have attempted to relate meaningful performance require- 
ments in booster autopilot systems to the quadratic performance criterion. We shall 
consider in detail the papers by Tyler and Tuteur, Fisher, ( 14 ) Bailey, ( 122 ) and 
Tyler. ( 136 ) Rather than discuss these individually, we will take a unified point of 
view. The results of particular studies will be shown to be special cases of a general 
approach. Among the problems investigated are the following. 

1. Stabilizing a flexible booster using multiple sensors. ( 14 ) 

2 . A model-following system. ^ 136 ) 

3. Minimizing deviations from desired states.^ 22 ) 

4. Relating the characteristic equation of the optimal system to the weighting 
matrix elements in the performance index. ( 43 ) 


These will be referred to as problems 1 through 4. As a preliminary, the results 
of Sec. 3.2.1 will be expressed in a variety of alternative forms to facilitate treatment 
of individual problems. 

In the notation of See. 3.2.1, the system considered takes the form 

x = Ax + Bu (468) 

y = Gx (469) 

which are merely Eqs. (321) and (322) repeated here for. convenience. For the per- 
formance index, we take the following version of Eq. (324) 


= lim j l {. 

t -,oo Jq \ 


f/ T t T \ 

x G QGx+u Ru |dt 


(470) 


In this case, the optimal control is given by Eq. (345). 


u*(t) = - K x (t) 


(471) 


-1 T 

K = R B P 


where R, B, and P are c exist ant matrices, the latter of which is obtained as the solution 
of the steady-state Riccati equation 

P A + A T P - P B R -1 B T P = - G T Q G (472) 
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This is the simplest case; the optimal control (471) is merely a constant matrix 
times the current state of the system (feedback principle). If we take for the perform- 
ance criterion 


J 



G 


QGx + u 


T 



dt 


(473) 


then the form of the optimal control is still given by (471) except that P, and therefore 
K, are now time-varying, with P obtained as the solution of 

-P = PA + A T P - PBR _1 B T P +G T QG (474) 

The above is merely a summary of the results obtained in Sec. 3.2.1. 

Alternative forms of this solution have been obtained bv Kalman^) and Merriam^ 6 ) 
and have been used in the studies by Tyler, f 1 ^ 6 ) Bailey/ 1 ’ and Tyler and Tuteur/ 1 ^) 
which will be discussed here. It will clarify the discussion to relate these results to 
the ones derived in Sec. 3.2.1. 


To obtain Kalman's equations, we first postmultiply Eq. (474) by x, obtaining 

. X -IT T 

-Px = PAx + A Px - PBR B Px + G QGx (475) 

The costate (Lagrange multiplier) vector is obtained from the optimal return 
function, W(x,t), via Eq. (309); viz., 



which in the present case may be expressed as 
X = - 2 Px 

using Eq. (333). However Kalman's performance index contains the factor l/2 in front 
of the integral of Eq. (470), which means that his optimal return function is half of that 
defined by Eq. (325). Furthermore, to obtain the minimum of the performance func- 
tion, he minimizes the hamiltonian defined by Eq. (310), whereas the conventional 
(maximum principle ) approach requires the maximization of the hamiltonian. This 
means that instead of X, as defined above , we must take 


X 


j_ aw 

2 ax 


Px 


(476) 
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to conform with Kalman's assumptions. Noting that 

X = Px + Px (477) 

we obtain, after substituting (476) and (477) in (475), 

X = -A T X-G T QGx (478) 

where we have used (468) and (471). Furthermore, Eqs. (468) and (471) show that the 
optimal trajectory is described by 

x = Ax - BR 1 B T X (479) 

in terms of the costate vector, (476). The last two relations may be written as 


x 


A 


-BR 


-1 





|_— G Q G 



(480) 


It is apparent that the costate vector, X, is the adjoint of x. Consequently, the 
eigenvalues (closed-loop roots) in the characteristic equation 


-1 T 

(Is - A) BR B 

G T QG (Is + A) 


(481) 


consist of the eigenvalues of the optimal system and their mirror images about the 
imaginary axis in the s plane, which belong to the adjoint system. 

We note further that the optimal trajectory is also described by 

x* = (A - BK) x (482) 


using (468) and (471). The characteristic equation of (482), together with its adjoint, 
is therefore given by 

| Is - (A - BK) | • jls - (A + BK) | = 0 (483) 

Thus the roots of (481) and (483) must be identical. This fact will be used later 
to establish a relationship between the eigenvalues of the optimal system and its feed- 
back gains. 
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We turn now to a discussion of Merriam's method. The performance criterion is 
taken as 

J = jf [( y D " y f Q ( y D " y ) + ( U D “ U ) T R ( U D " U )] d * (484) 

Here y is the output vector given by Eq. (469), and y D is a reference or desired 
output. Similarly, ujj is the desired or reference control, and u is the actual system 
control vector. The matrices Q and R are assumed to be diagonal matrices that may 
be time-varying . 

The optimal return function is defined in a manner similar to Eq. (325) as 
follows. 

Jkf 

W (x, t) = Min f ^y D - yj T Q ^ - y j + ^u Q - uj T R ^ - ujjdt (485) 


Proceeding as in Sec. 3.2.1, we find 


aw 

= Min 

3 t 

u 

_I_ 1 

/3 W\ T 

+ 1 

Ux ) 


Mj n (y D - y) T Q (y D - y) + - u) T r (u d - u) 


(Ax + Bu) 


The optimal control is found to be 


* 1 / 3 W \ 

u* = u_ - — R B I ] 

D 2 \ dx / 

Assuming a solution of the form 


u* = u„ - - R B 


XX XX XX 

(X, t) = p (t) - 2 ^ p (t) X (t) + 2 P fly W X fl (t) X (t) 

a=l )J=1 y=l pr p r 
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(490) 


- -2p + 2 V 

a 




Substituting (487) through (490) in (486) and assuming that p ^y = Py^ leads to 


n f , n 

(p + F) - 2 V' p + F I x + V* p + F \x Z 

5=il« «J <* h I 00 


n n 


+ 2 


SS * fir+ 2 Fpy + 2 Fy/3 ] x ^ 


Xy " 0 


(491) 


The quantities F, F^, and F ay depend only on the components of the matrices A, 
B, Q, R, and G. In order for Eq. (491) to be satisfied, each of the coefficients on the 
left-hand side of this equation must vanish independently. This leads to 


-p = F 


(492) 

1 

ii 

p" 1 


(493) 

~ *Py 2 ( F py + F yp) 

a, p,y , = 1, 2 , , n 

(494) 


- p = F 
act aa 

Eqs. (492) through (494) are a set of [l/2 n (n-1) + 2n + l] differential equations 
for determining the various p's. Noting that 


W (x, t f ) = 0 

via (485), we find, for the initial conditions. 


P = P a (tf) = Pj3 y (V = 0 


(495) 


Having calculated the p's, the optimal control is obtained directly from Eq. (487), 
making use of (490) . 

We now consider in detail the four problems stated at the beginning of this section. 
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Problem 1 


In order to investigate the possibility of stabilizing a flexible launch vehicle using 
the theory developed here, we will consider the system whose dynamics is expressed 
by 

m Uq (a - 0) = T^i -L^a-(mg cos 0 q) 0 (496) 


0 = n 6 +fi a 
c a 


a + 2 £ u q +co q 
4 1 S 1 1 4 1 1 4 1 



(497) 

(498) 


The symbols have the following meaning. t 
g = gravity acceleration 

I = moment of inertia of vehicle about pitch axis 

y 

l = distance from mass center of vehicle to engine swivel point 
c 

£ ^ = distance from mass center of vehicle to engine swivel point 

L = aerodynamic load per unit angle of attack 

m = mass of vehicle 

= generalized mass of first bending mode 

= generalized displacement of first bending mode 

T = control thrust 
c 

Uq = forward velocity of vehicle 
a = angle of attack 
6 = thrust angle deflection 

0 = attitude angle 

0 Q = steady-state attitude angle 
T l 


c c 



tSee Ref. 167 for a more complete account of this problem. 
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a 


L l 

a a 

I 

y 


£ = relative damping factor for first bending mode 

(jq_l = undamped natural frequency of first bending mode 


Conventional instrumentation whose output is a linear combination of the state 
variables is described as 


Rate Gyro : 

V ■ k r(® + °g 1, 4i) 

Position Gyro : 

9 PG "Hg'NK 
Accelerometer : 

r T 6 - L a . ... . 

0 - -tH/S) 


0 = K 

A a 


m 


Angle o f Attack Sensor : 

0 = K («-a {1 \) 
a a\ a 1 / 


(499) 


(500) 


(501) 


(502) 


Here, Kj^, K a , Kq,, and Kp are the gains associated with the respective sensors; 
Otj, is the thrust acceleration, and the oC 1 ) quantities are the normalized bending mode 
slopes whose subscripts indicate the location of the sensor along the vehicle. 


If we define 



then the system described by Eqs. (496) through (498) may be expressed as 
x = Ax + Bu 


(503) 
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with 



This is equivalent to the system (468) and (469), with G = I, the unit matrix. 

Taking a performance index of the form (470), the optimal control is given by 
Eq. (471); viz., 

if(t) = - K x (t) (504) 

Note that in the present case, R is a scalar, since u is a scalar and K in the above 
expression is a lxn matrix (i.e. , a row vector). The control is optimal in the sense 
that the performance index (470) is minimized; this criterion has as yet not been 
directly related to control system requirements, such as response speed, overshoot, 
and damping. These may indeed be determined once K is calculated, since the closed- 
loop poles of the system are then directly obtainable. 

Obviously, K is a function of Q and R, and therefore the properties of the "optimal 
system" are directly (though not simply) related to the choice of Q and R. In Fisher's 
paper, ( 14 ) R is taken as unity, and an iterative process that will lead to a system of 
acceptable dynamic qualties is performed on Q (starting with some arbitrary choice). 
Basically, one guesses Q, determines K, and then calculates the closed-loop poles of 
the resulting system. If this does not yield an acceptable system from the point of 
view of dynamic response properties, the process is repeated with a new choice of Q, 
etc. The method is completely cut -and -try . Fisher gives no technique whereby 
successive iteration converges to some desired form. Previous experience no doubt 
is a valuable guide in obtaining meaningful results. 
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A basic property of the optimal control (504) is that there is one feedback loop for 
each control variable. It is known, however, that acceptable systems may be designed 
with only one or two feedback loops. Of course, the more feedback loops employed, 
the greater the capability to achieve specified performance. It is instructive, there- 
fore, to investigate the means whereby an n** 1 - order system may approximate the 
optimal control (504) with fewer than n feedback loops. Let us assume that m sensors 
are used and that the output of each sensor is given by 

n 

z i = £ K X (505) 

j=l J 

i = 1, 2, ♦ • • • , m 

We note, for example, that two rate gyros may be viewed as two different sensors 
if their K R and q^ 1 ) are different. Thus for the system (496) through (498), the use 
of one position gyro and two rate gyros and two accelerometers (having different gains) 
will provide five Independent measurements that are linear combinations of all the 
state variables. 

If now we use 


u 


/ 


m 


-£ 

4=1 


c. z. 
i l 


(506) 


to denote the control function employing the sensors (the Cj are as yet undetermined 
constants), then (504) and (506) will be equivalent if 


n m m n 

E k i x i = Lvi = E c iL b i 3 X j 

i=l i=l i=l 1=1 J J 


This leads to 


(507) 


T T 

b c = - K (508) 

b = m x n matrix 
c s m x 1 matrix (i.e. , m vector) 

K = 1 xn matrix (i.e., n-dimensional row vector) 
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If in = n, then the components of c are uniquely determined from Eq. (508). If 
m < n, then there is no unique solution for c. However, for an approximate solution, 
we may seek the value of c that minimizes the error criterion 

E = || b T c + K T H 2 = (b T c + K T ) T (b T c + K T ) (509) 

This is easily found to be 

~ = 2 |bb T c + bK T j = 0 
or 

bb T c = -bK T (510) 

The matrix equation (510) is a system of m linear equations for the determination 
of the m components of c . 

It should be noted that the approximate solution will not be optimal and that stability 
is not guaranteed. Thus each particular solution must be investigated individually to 
determine if performance is acceptable. 

Problem 2 


The given system is again described by 
x = Ax + Bu 
y = Gx 


It is desired that the system behave as the "model" 

V = lt? 

In order to achieve this, we formulate the performance criterion as 


= J lim f Vy - Ly) T Q (y - Lyj + u T Rul 

t. -+<*> Jq l J 


dt 


(511) 

(512) 


(513) 


(514) 


Presumably, an appropriate selection of the weighting matrices, Q and R, will 
yield the desired result. An analysis completely identical with that of Sec. 3.2.1 
shows that in the present case , the optimal control has the form 


u (t) 


--1 T 
R B 


P+G Q(GA-LG) 


(515) 
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where P is the solution of the steady -state form of the matrix Riccati equation 


i 


» 1 
PA + A P + G QG-PBR 


B P = 0 


and 


(516) 


■» T T 
R = B G QGB +R 

» --1 T T 

A = A - BR B G Q (GA - LG) 

a ^-1 T T 

Q = Q-QGBR B G Q 

G = GA - LG 


It is convenient to write Eq. (515) in the form 


u (t) 



where 


—1 T 

K = R B P 
r 

*-l T T 

K m = R B G Q (GA - LG) 


(517) 

(518) 

(519) 

(520) 

(521) 

(522) 

(523) 


The control is thus expressed as a sum of two terms, one of which depends on the 
solution of the Riccati equation, and the other of which depends on the properties of 
the model. 

We now seek to obtain some insight into this result. More specifically, what must 
be the stipulations on Q and R in order to have the given system exhibit the dynamic 
properties of the model? For this purpose, we consider the second-order system 
described by 



0 

1 


0 

A = 

-A„. 

-A„„ 

B = 

B 


21 

22 


21 


0 1 
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with weighting matrices 


Q = 


Q 


11 

0 


Q 


22 


R = R^, a scalar 


and the matrix describing the model 


L = 


21 


22 


We then find 


R = Q B + R 
11 z2 21 11 


A n = 0 


A 12 = 1 


21 


A 21- 


®21 Q 22 (L 21 " A 21 ) 


B Q + R 
21 ^22 11 


A 22 " A 21 


B 21 Q 22 (L 22 " A 22> 


B Q + R 
21 ^22 11 


Q = Q 
22 22 


2 _ 2 
B Q 
21 22 

2 

B Q + R 
21 ^22 11 


Q = Q 

^11 11 


K = 
m 


B Q 
21 z2 


B Q + R 
21 ^22 11 


( L 21 ” A 2l) ( L 22 “ A 22) 


The determination of Kj. depends on the components of the P matrix, which are 
found from Eq. (516) as follows. 
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T* 

It Is not necessary to calculate P-q, since u(t) depends on the product of B and 
P in Eq. (522), and the ®11 element that would multiply Pll is zero. 

Now since we are interested primarily in minimizing the "error" term (y-Ly) in 
(514), it is logical to take Q » R. In the present case, it is sufficient to let Q 22 >>R n- 
This leads to 

K m ~ 4 [( Lai ' M N ' A ")] 


while 



which means that 



12 


0 


and, in turn, 

K « 0 
r 


Consequently, the optimal control reduces to 

u (t) a* - K x 
m 


r 


— — 

1/ \ / \ 


1 

(L - A„ L - A \ 



H 21 21 / \ 22 22 ) 





L 2J 


= — — - A„ A x, + ( L„_ - A„ \x„ 

B \ 21 21} 1 \ 22 22 J 2 


(524) 
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Substituting this in (511), 
x = Ax + Bu 
= (A-BK^x 

But 



Thus the optimal system behaves exactly as the model. It is instructive to 
examine these results in terms of conventional transfer functions and feedback loops. 

Fig. 14 depicts the feedback loops introduced by the optimal control function (524). 
The system with the feedback loops, shown in part (a), reduces to the form shown in 
part (b) after some elementary simplifications. The distinctive feature of this result 
is that near-infinite gains are not required to make the given plant behave as some 
prescribed model. In the conventional model -foil owing scheme, shown in Fig. 15, the 
overall transfer function is 

c (s) _ G (s) [1 + K M (s)] 

R (s) 1 + K G (s) 

^ + G (s) M (s) 

tt + G(6) 

from which it is apparent that 

TW - M(s) forK ~" 

Thus the scheme generated by optimal control theory appears extremely attractive. 
In practice, however, there are two fundamental limitations. First of all, there is no 
guarantee that response to external disturbances is acceptable. Second, all the state 
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Figure 15. Conventional Model -Foil owing System 


variables must be accessible for measurement. For moderately high-order systems, 
the latter condition may be difficult to realize. Nevertheless, the above methods are a 
novel approach to an outstanding problem and may prove useful in certain applications. 

Problem 3 


Instead of including the model directly in the performance index, one may attempt 
to match the system output to the model output. This leads to a performance index of 
the form (484), which is repeated below. 
t f 

J = l [( y D “ y ) T Q ( y D " y ) + ( U D " U f R ( U D ' U )] dt (525) 

The distinction between this and the types previously considered is that the upper 
limit of integration, t^, is finite. Consequently, it is anticipated that the optimal con- 
trol function will contain time-varying gains. Bailey^ 122 ) applied Merriam's technique 
to the problem of optimizing the performance index (525) for the yaw dynamics of a 
launch vehicle whose motion is described by 
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(526) 


* = 


Vfi 

\ * ', D o 


T X 

• C c . 
Y fi 


Here 


( 527 ) 


I = moment of Inertia of vehicle about yaw axis 
z 

l = distance from mass center of vehicle to engine swivel point 
C 

i = distance from mass center of vehicle to center of pressure 
P 

L fl = aerodynamic load per unit angle of attack 

P 

m = mass of vehicle 

T = control thrust 
c 

Uq = vehicle forward velocity 

Y = normal displacement measured parallel to an Inertial reference 
6 = control engine deflection 
$ = yaw attitude angle 


Via the definitions 



u = 6 


The pertinent system matrices become 
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A = 


0 


m U, 


Vi 

Vo 


0 0 


0 0 


0 — (T + La) 

m c P 


V, 


B = 


c 

m 


T H 
c c 


G = I, the unit matrix 

The optimal control in this case is given by Eq. (487), which, in combination with 
(490) and the above value of B, reduces to 


u = u 


D R 


11 


+ E p». 

0=1 


01 “0/ I_ ( P 3 + S P/33 x /?yj 


(528) 


where R = R.^ is a scalar, since u is a scalar. The p's are obtained from Eqs. (492) 
through (494) , which in the present case take the formt 

2 


B 


- p = Q Y +A p+p+A p 
P 1 11 D 11^1 V 2 31*3 R 


11 


11 


P 1 P 11 


B B 
31 11 


R 


11 


( P ll P 3 + P 1 P l 3 ) ' 


B 31 

R 11 Pl3 P 3 


(529) 


tWe recall that p = p„ 

a0 )3a 
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®ii / \ 

“ P 2 = Q 22 Y D " TT^ P 1 P 12 R^ - ( P 12 P 3 + P 23 P l) 


31 

R P 23 P 3 
11 


-P 3 = Q 33 + P 4 ~ Tf^ P 1 P 13' “R^ - 


( P 3 P 13 + P 1 P 3 3 ) 


31 


Pn P, 


R 3 33 


B 


11 


- p 4 ‘ <S 44 i|, D +P l A 14 +P 3 A 34‘ R^ P 1 P 14 


B B 
31 11 


R 


11 


( P 3 P 14 +P 1 P 34)- 


B. 


31 


Po P, 


R 11 3 34 


11 


B H 3 

Q ll +2 A 11 P 11 +2P 12 +2 A 31 P 13 R 1X P H 


2 B 31 B 11 . J” ^ 

R n P n p i3 R n 13 


‘ P 12 " A 11 P 12 + P 22 + A 31 P 23 _ R^ P 12 P 11 


B B 31 

^ /p p +p p\“ "r P 13 

R y 12 P 13 v 23 y l) R n 13 


®31 B 


23 


(530) 


(531) 


(532) 


(533) 


(534) 
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B 11 

P 13 A 11 P 13 + P 23 + A 31 P 33 + P 14 R n P 11 P 13 


B 31 B 11 / 2 


R 


11 


/ 2 \ 31 

\ P 13 + P 11 P 33/ R n P 13 P 33 


P 14 “ A 11 P 14 + P 24 + A 31 P 34 + A 14 P 11 + A 34 P 13 " R n P 11 P 14 


B 31 B 11 / + n v hl_ D „ 

R u \ P 14 P 13 P ll P 34j R 1X P 13 P 34 


. B U . 2B 31 B U „ 2 

p _ T-> P ~ T* P-IO Poo -O Pn 


'22 22 R 12 


R 12 23 R 1X 23 


B H B 31 Bl1 /„ , „ „ \ 

P 23 ' P 24 “ P 12 P 13 " R u ( P 13 P 23 P 12 P 33j 


B 


31 


P„„ P, 


R 23 33 


11 

“ Pr ~ A-,, P 10 + ^ Poo T> P-IO Pi A 


B 31 B 11 


24 14 P 12 34 ^23 R n '12 '14 R u 


( P 23 P 14 + P 12 


B_ 


31 


R P 23 P 34 


P 33 Q 33 +2P 34~ R u P 13 


2 B 31 B 11 _ ^31 2 

P-IO Poo T> Po 


R 


11 


13 33 R 33 


(535) 


(536) 


(537) 


(538) 


”34) 

(539) 


(540) 
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“ P 34 A 14 P 13 + A 34 A 33 + P 44 “ R n P 13 P 14 

B 31 B 11 / \ B 31 

R n v 13 ?34 + 1)33 Pl4 ) " R n 1)33 1)34 

' P 44 Q 44 + 2 A 14 P 14 + 2 A 34 P 34 ~ R n P 14 

2 B 31 B 11 B 31 2 

R P l4 P 34 " R X1 P 34 


(541) 


(542) 


The weighting matrix Q has been assumed diagonal . 



Q = 



0 




and the subscripted A and B terms are the components of the respective matrices. 

The set of equations (530) through (542), when solved with the initial conditions 

f«<V - ' 0 <543) 

a, P = 1, 2, 3, 4 

yield the p's which, when substituted in Eq. (528), give a control function of the form 
u (t) = U D (t) +K (t )* +Kjj(t)4> + K y (t) Y + (t) Y (544) 

Bailey^ 33 ^ gives some results for a specific selection of and the that 
represents a compromise between the desirability for "tight" control and m ini m u m 
drift. The superior results obtained by using time-varying gains in the feedback loops 
must be weighed against the added complexity of the control system. Furthermore, the 
influence of the neglected higher-order dynamic effects, nonlinearities, and parameter 
uncertainties , in addition to extraneous disturbances, remains to be evaluated. 
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Problem 4 


We again consider the system 
x = Ax + Bu 
y = Gx 

with the performance criterion 


J = lim 




x T G T QGx +u T Ruj dt 


Luj i 


(545) 

(546) 


(547) 


The optimal control has been found to bet 


u (t) = - K x (t) (548) 

where K is a constant matrix. Substituting this in Eq. (545), we obtain the equation 
of the optimal system 

x = (A - BK) x (549) 

We seek to determine the relationship between the components of K (feedback 
gains) and the components of the weighting matrices, Q and R. 


An alternative derivation has shown that the optimal system and its adjoint satisfy* 


r.' 

X 


-1 T 
- BR B 


T 

-G Q G 


r 


(550) 


The characteristic equation of the optimal system, together with its adjoint, may 
therefore be obtained from (549) as 


1 1 s - (A - BK) | • 1 1 s -'(A + BK)| =0 
or, from (550), as 

-1 T 

Is - A BR B 

= 0 

T T 

G Q G I s + A 


tSee Eq. (471). 
*See Eq. (480). 


(551) 


(552) 
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f 

f 


Since (551) and (552) are the characteristic equation for the same system, they 
must have the same roots. Therefore, by equating coefficients of like powers of s, 
we obtain a set of equations that relates the components of K to the components of Q 
and R. However, this brute-force approach soon runs into an avalanche of complicated 
algebra that yields little insight and no general results . 

Tyler and Tuteur^ 13 ^ show that if Q is a diagonal matrix and R is taken as the unit 
matrix, then for large Q^, the optimal system exhibits the properties of a Butterworth 
function. Their design approach involves the expansion of the determinant equation 
(552), which is the characteristic equation of the optimal system, together with its ad- 
joint. A root locus is drawn for a specific Qy as the "system gain" and with all other 
Q.. set equal to zero. After a Q matrix is determined in this fashion, the calculation 
o^ the K matrix is straightforward. 

Rather than explore this technique in detail, we will indicate a simpler and more 
elegant approach that affords a higher degree of insight. This is based on a crucial 
simplification of the characteristic equation (552), which will now be developed. To 
do this, we will make use of the following two relations between determinants. 

a j8 

= | a I • |6 -y a -1 0| (553) 

y 6 

-yfll = 4 n_m lMl m - I (554) 

where 

a = m x m matrix (nonsingular) 

P = m x n matrix 
y = n x m matrix 
6 s n x n matrix 
(i = scalar 
I. = i x i unit matrix 

l 

The first of these is proven in Bodewig's book(l^) (p. 217), and the second is due 
to Plotkin.^ 173 ) Applying (553) to (552) yields 

Isljj-Al • Ifsljj + A^ - G T QG (8 I q - A) -1 BR -1 B T | = 0 (555) 
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But | s ^ - a| =0 only for those values of s that correspond to the open-loop poles. 
Consequently, the characteristic equation for the closed-loop (optimal) system reduces 
to 

| (s I n + A T ) - G T QG (s I q - A)* 1 BR _1 B T | = 0 (556 

Furthermore, 

(s I n + A T ) - G T QG (s I n - A)' 1 BR _1 B T 

- ( 8 + aT )[‘ ' K + A T gT « G ( s - A ) _1 BH ' 1 B T ] 

which means that 

I s I n + A T | ' I Jjj ” ( s T n + aT ) _1 g t Qg(si d - a)- 1 br _1 b t | = 0 

Now since | s I n + A T | =0 only for those values of s that correspond to the open- 


loop poles of the adjoint system, Eq. (556) is further reduced to 

I ^ “ ( s T n + A T ) 1 G T QG |s I n - a| 1 BR 1 B T | = 0 (557) 

An application of (554) yields 

i -IT/ T\-l T / \-l 

^m ” R B ( S ! n + A ) G QG ( SI n ' A ) B| = 0 (558) 

We now observe that the quantity 

L(s) = G ^1 s - a| 1 B (559) 

is the matrix transfer functiont for the system (545) and (546). Also, 

T T / T\-l T 

L (-s) = -B (is+A ) G (560) 

Therefore, Eq. (558) becomes 

I I m + R 1 L T (— s) Q L (s)| = 0 (561) 


tSee Ref. 174, Sec. 3.4. 
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This is, in fact, a multidimensional form of Chang's Root Square Locus method. W 
In principle , one could plot the loci for Q^/Rjj as a parameter and determine the Q 
and R that yield acceptable dynamic response. From this, the gain matrix, K, and 
hence the optimal system, could be determined. 


Consider, for example, the single-input/single-output system for which 


G ^ 1 x n matrix 
B = n x 1 matrix 
Q s Q s scalar 

R = R 11 s scalar 

Then Eq. (561) takes the form 


1 + 


Q 


11 


R 


11 


L (-s) L(s) 


0 


where 


Y(s) 

U(s) 


L(s) 


= N(s) 
D(s) 



r < n 


(562) 


(563) 


and Zj and Pj are the zeros and poles respectively of the open-loop system, with k a 
known factor. 


Defining 



we express Eq. (562) as 


r k 2 Q 


0 = 1 + 


11 


11 




lii 


= T(s) T(-s) 


(564) 
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Thus standard root locus techniques can be used with Q^/R^ as the variable 
gain. A given value of this variable gain yields pairs of roots Qj, each pair repre- 
senting a point in the s plane , together with its mirror image about the Imaginary 
axis; i.e. , a closed-loop pole for the optimal and adjoint system. If these closed-loop 
roots are acceptable from a dynamic response point of view, then the corresponding 
Q 1;l and R 1:L are used to determine the optimal gain matrix, K. 

Note that the optimal system contains only those roots in the left-hand s plane. 

It can be shown that the optimal system obtained by this procedure is always stable. 
Therefore, the mirror image poles belong to the adjoint of the optimal system. 

In simple situations, the components of K may be directly related to 
For the second -order case described by 



we obtain, after equating Eq. (551) to (552), 


2 

s + s 


+ c + c K^J | - s + c + ^b + c K.jj 


/ 2 / 2 , \ ^11 2 
= ^s +as+bj^s - a s + b I + — — c 


Equating coefficients of like powers of s and solving for and K2 yields 
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Thus, because the relationships between the dynamic characteristics (such as 
closed -loop frequency and damping ratio) and the feedback gains are known, these can 
be related directly to the components of the Q and R matrices. 


3.3.3 Optimal Re-entry From Orbit 

The guidance and control problem for a manned spacecraft during the atmospheric 
re-entry phase must take account of two primary figures of merit: the surface heating 
rate and the accelerations experienced by the crew. To formulate the problem mathe- 
matically, the vehicle is assumed to be a point mass moving about a spherical, non- 
rotating earth with an inertial coordinate frame as shown in Fig. 16. The motion is 
described by 


h = - V sin y 

. D 

V = g sin y 


g cosy V cosy L 

V ~ r + h "mV 
E 



(565) 

(566) 

(567) 
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where 


D = drag force 

g = gravity acceleration 

h = altitude above earth's surface 

L = lift force 

m = mass of vehicle 

r„ = radius of earth 
E 

V = velocity of vehicle 
y = flight path angle 

The rate of heating due to atmospheric friction is given by 

K 4 p l/2 V 3 (568) 


where is a heating constant and p is the atmospheric density. The expression (568) 
represents the heating rate per unit surface area, so that for a particular vehicle, K 4 
is a function of the surface area of the vehicle nose region. 

The acceleration sensed by the crew is due only to the aerodynamic forces and 
is given by 



The limit of human endurance is a function of the acceleration magnitude and the 
length of time applied. Within the range of 5 to 10 g's, this endurance limit is roughly 
a linear function of the acceleration squared . 


In view of these observations , a reasonable measure of performance may be ex- 
pressed as 


J 



k 4 p 1/2 v 3 + 



(570) 


where K 7 is a relative weighting constant between the heating and acceleration effects . 
It is convenient to define the additional state variables 
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(571) 


\ = f K p l/2 V 3 dt 


- t kL t pi 

A n> 2 


1 


dt 


(572) 


with 


* 4 <y - o 


x 5 #,) - 0 


and add the equations 


*4 - P l/2 V 3 


(573) 

(574) 


(575) 


X 5 = 


x 2 x.2 

L + D 


m 


to the system (565) through (567). 

The performance criterion becomes, therefore, 
J = x 4 (tf) + k 7 x 5 (ty 
For notational convenience, we define also 


X, 

= h 

r = K 

1 


E 1 

X 2 

= V 

g = K 2 

X 3 

= 7 

m = K 


(576) 


(577) 


We approximate the atmospheric density by the exponential model 

n k 2 Vi 
p = k 5 e 


(578) 
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where Kg and Kg are constants . 


The aerodynamic forces are expressed as 


L -T» X 2 2K 10 C L 


(579) 


D = -ipx o 2 K n C 
2 2 10 D 


(580) 


where K^ is a reference area and and C D are the lift and drag coefficients. The 
latter are approximated by 


C = K sin u cos u 

Ij 11 


C = K„ +K„sin u 
D 12 13 


(581) 

(582) 


where K^, K 12 . and K^g are appropriate constants for the lift drag polar and u is the 
angle of attack, which is taken as the control variable. 

Combining all the above relations, the state equations for the system become 


*1 ' - x 2 sin *3 ' £ 1 


(583) 


. „ K 5 K 10 V 

X 2 = K 2 SmX 3 -—K— & 


1 X 2 2 ( K 12 + K 13 S ‘° 2u ) ’ f 2 


(584) 


x o = K 

3 2 x_ 


cos X x cos X 

O a u 


K l +X l 


K 5 k io k ii Vl . 

— e x sin u cos u s f 

K 2 o 


(585) 


2 Vl 3 _ , 
X 4 ' K 4 K 5 6 x 2 *4 


(586) 
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(587) 



The problem of optimal control is now formulated as follows. 

"Given the system described by EqB. (583) through (587), calculate the angle of 
attack history, u(t), that will minimize the function (577)." 

The boundary conditions are 

X 1 ( V = h 0 X 1 (t f> = h f 

X 2 <V = V 0 X 2 (t f } = V f 

X 3 <V = 7 0 X 3 <V = 7 f (588) 

* 4 Cti) = 0 

X 5 <V = 0 

This problem was analyzed by Payne, ( 175 ) using the maximum principle. We 
form the hamiltonian 

5 

H = X, f , < 589 > 

3=1 

where the satisfy!" 


tSee Sec. 3. 1.2 
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X 3 = V 2 008 X 3 “ X 2 K 2 cos x 3+ Ag 


r K 2 Binx 3 


x„ sin x„ 
_2 3 

K l +X l 


X 

X 


4 

5 


= 0 
= 0 


(592) 

(593) 

(594) 


In order to minimize J, Eq. (577), we must maximize H, Eq. (598), with respect 
to u. If u is not bounded, it may be obtained from 


3H (x, X, u) 

du 


(595) 


However, a brief examination of the system (583) through (594) indicates that this 
is not a simple task. Thus it is necessary to use some iterative procedure to deter- 
mine u such that H is a maximum at all points along the trajectory. This may be done, 
for example, by computing the optimal u such that the relation 

Maut H (x, X, u) (596) 


is satisfied for all t. It is sufficient to let u assume a range of discrete values and 
calculate u by direct search. It is necessary simultaneously to solve the two-point 
boundary value problem represented by the 10 equations (583) through (587) and (590) 
through (594). Eight boundary conditions are specified by (588), and the remaining 


two are given byt 


i—l 

1 

II 

5r 

r< 

(597) 

x 5 (t f ) = -k 7 

(598) 


Payne^ 175 ) uses a variant of the Neighboring Optimum method^ 119 ) to solve this 
system. The optimal trajectory thus obtained is shown in Figs. 17 through 21, which 
depict the time histories of each of the state variables. The time history of the optimal 
control function is shown in Fig. 22. In each case, the solid line represents the con- 
dition Ky = 0, while the dashed line Is the case for K ? = 200. The most pronounced 
effect of including the acceleration in the criterion function is shown in Fig. 21, which 


tSee Eq. (85) and the discussion in Sec. 3.1.2. 
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VELOCITY (thousands of feet/sec) 


400 




Figure 18. Optimal Trajectories, Time-Velocity 


52 


HEATING RATE (lb-ft/sec) FLIGHT ANGLE (deg) 


n 



Figure 19. Optimal Trajectories, Time-Flight Angle 
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CONTROL (deg) ACCELERATION (g s) 




Figure 22. Optimal Trajectories, Control Function 
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indicates a reduction of about 25 percent in the g forces by using the weighting factor 
of Ky = 200. The heating rate (Fig. 20) is thereby increased, but not significantly. 

Fig. 23 shows how the total heat and acceleration effects along the optimal tra- 
jectory vary with parameter Ky . This plot can be used to determine the minimum 
amount of additional heating that the vehicle must absorb in order to achieve a specified 
reduction in acceleration effects. For example, to reduce the acceleration as shown 
in Fig. 21, it is necessary that the vehicle absorb about 10 percent additional heat 
during the re-entry maneuver . 

The basic data for the problem is given below. 

K = 2.09 X 10 7 ft 

K = 32.2 ft/sec 2 

K = 250 lb sec 2 /ft 
*5 


K 4 = 1.0 x 10 -4 (lb) 1//2 sec 



Figure 23. Optimal Tradeoff, Heat -Acceleration Effects 
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K 5 = 0.052 (lb) 1/2 sec/ft 2 
K g = -4.26 X 10 -5 ft -1 


K 10 

= 66.5 ft 

K 11 

= 1.2 

K 12 

= 0.274 

K 13 

= 1.8 


Initial conditions: 

x i (t.) = 400,000 ft 
x 2 (tj) = 36,000 ft/sec 

x 3 (t.) = 8.09 deg 

Final conditions; 

= 250,000 ft 
x 2 (t f ) = 27,000 ft/sec 
x 3 (t f ) = Odeg 
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APPENDIX A 


COMPUTATIONAL SOLUTION OF TWO-POINT 
BOUNDARY VALUE PROBLEMS 

As noted repeatedly in this monograph, the solution of an optimization problem by 
variational methods or the maximum principle leads to the requirement for solving a 
set of differential equations with two-point boundary conditions. This requirement is 
indeed one of the main limitations of the classical approach, since the solution of bound- 
ary value problems is a formidable computational task. Various solutions have been 
obtained in special cases for particular problemst, but a reasonably general theory 
was not available until fairly recently. 

The basic idea in the method to be described here is that of "quasilinearization, 
in which a nonlinear problem is transformed to a sequence of linear ones by means of 
a "generalized Newton Raphson operator. "(145) the name implies, the method is 
closely akin to the Newton Raphson technique for obtaining the roots of transcendental 
equations by successively replacing the arcs of a curve by its tangents. 

Since the solution of the nonlinear problem is reduced to solving its linear equiva- 
lent, the latter will be considered first. 

Al. THE LINEAR CASE 

Given the matrix vector differential equation 

y = A(t)y + h(t) (Al) 

where y(t) and h(t) are n vectors and A(t) is an nxn matrix. It is required to deter- 
mine y(t) in the interval (0 , tj) such that the boundary conditions 


y j (0) 

= a. 
3 

j = 1, 2, .. . , r 

<A2> 

y j<V 

= b. 

J 


(A3) 

j = r+1 , . . . , n 



are satisfied. 


tCf. Refs. 64, 77, 110, 113, and 120. 
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A simple and direct method for obtaining y(t) is the following. 


Integrate the homogeneous version of Eq. (Al) 
u = A(t) u 

(n-r) times. The initial conditions at the m*'* 1 time are 

u ( m )( 0 ) _ 0 j f r+m 

=1 j = r+m 

This yields the (n-r) vectors 
u (m) (t) , 0 < t < t j 

m = 1 , 2 (n-r) 

Now integrate the equation 

v = A(t)v + h(t) 

using as initial conditions 

v.(0) = a. , j = 1, 2, . . . , r 

J J 

= 0 , j = (r+1) n 

Call this solution 

v(t) , 0<t<t f 

The general solution of Eq. (Al) is then given by 
m 

y,(t) = S C u (W (t) + v (t) 

J p=l P J J 


where the Cp are (scalar) constants determined by solving the (n^ 
equations 


b. 

J 


m 


£ - 


u, 
P 3 


(P) 


(tj) + v.(t f ) 


j = (r+1), • • • ,n 


(147) 

(A4) 

(A5) 

(A6) 

(hi) 

(A8) 

(A9) 

(A10) 

r) linear algebraic 

(All) 
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The distinctive feature of this method is that all integrations are performed with 
initial conditions; it is not necessary to "guess" at final values. The solution is exact 
(within roundoff error) after a finite number of operations. 


A2. THE NONLINEAR CASE 


The vector equation to be solved now takes the form 
x = f(x,t) 


(A12) 


where x and f are n vectors, and the boundary conditions are given by 


X (0) = a 


(A13) 

J J 

j = 1, 2, . . . , r 

x.(t.) = b. 
J r j 


(A 14) 

j = (r+1) , . . . , n 


Let x^(t) be an initial guess for the solution of (A12). 

Successive improvements 


are obtained from 


where 

by 


.(k+D 



, (k) 

ij 


. F< k) (x< k+1 > - x< k >) + f< k) 

(k) \ (k) 

' , t), and F' ; is the Jacobian matrix whose 

df.(x (k) ,t) 

dx. 

J 


(A15) 


..th . 

ij component is given 


(A 16) 


Eq. 


(A15) may also be written as 
x (k+ l) = F (k) x (k+1) + w (k) 


where 

w« = f (k) - F«*« 


(A17) 


(A18 


Eq. (A 17) is a linear equation, which, together with the boundary conditions, (A13) 
and (A14), is completely analogous to the system (Al) - (A3). It may therefore be 
solved by the methods of the previous section. 
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Note that the conventional Newton Raphson method solves the scalar equation 
g(x) = 0 via 


g(x.) 

X i+1 X i g'(x.) 


(A19) 


g'(x) = 


dg 

dx 


x Q s initial guess 
Writing (A19) in the form 

0 = g'(x.) [ x. +i - x. ] + g(x.) (A20) 

The analogy with (A 15) is evident. The generalized Newton Raphson approach is 
conceptually identical to the scalar case in that a curve is successively replaced by 
tangent lines. In other words, the nonlinear problem is replaced by a sequence of 
linear problems. 

A suitable "stopping" criterion is given by 



where c is a predetermined error vector. 

A relevant question at this point is: "What are the. conditions that ensure that this 
procedure converges to the solution of (A12)?" 

It has been shown ^5) that sufficient conditions for convergence are 

1. f.(x,t) is strictly convex - '" for all t in the interval (o, t^) , and 

2. df . (x, t)/d x. > 0 when i / j 


tA function <p(x) is strictly convex in an interval (a,b) if, for x^, x 2> A, with 
a < x 1 < x 2 < b and 0 < A < 1 , we have 

<p [A x 1 _ (1-A) x 2 j < Acp ( Xl ) + (1-A) <p (x 2 ) 

If <p" (x) exists , then <p (x) is strictly convex if c p" (x) > 0. 
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We emphasize the word "sufficient," since these conditions may not be necessary. 
This procedure has been found to converge, for example, when none of the fj(x) were 
convex. Each case must therefore be investigated independently and convergence 
checked by a criterion of the type (A21) unless the aforementioned conditions are satis- 
fied beforehand. It is worthy of note that convergence (when it occurs) is quadratic; 
that is , the number of correct digits approximately doubles at each iteration. Further- 
more, this convergence is monotonic, which means that accurate solutions are obtained 
with relatively few iterations, even with poor initial guesses. 

Thus far, we have assumed that the final time, tf, was known beforehand. It some- 
times happens that tj is not known initially , and this requires a modified treatment. 

One approach is described in Ref. 145, which, however, entails numerical differentia- 
tion — a procedure to be avoided if possible. A superior method, due to Long, (1®®) 
is the following. 

Introduce a new independent variable , s , defined by 
t = as 


where a is a constant to be determined. The new endpoints are taken as s = 0 and s = 1. 
Therefore, once a is determined, the value of tf is given by tj = a. 


Suppose that a typical equation of the system (A12) is x. = L (x , t) . 


If we write 


dx 

ds 


/ 


x 


then we have 

x' = aL(x,as) 


The parameter a is treated as an additional state variable by writing 


/ 

a 


= 0 


Thus the new state vector of the system is of dimension (n+1) , and we proceed as 
before. Note that one makes an initial guess on a ^in x(°)(t)^ . This is therefore con- 
stant during each iteration, but it varies from one iteration to the next. Thus the de- 
termination of tf becomes an integral part of the quasilinearization procedure. 
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APPENDIX B 


THE VARIATION OPERATOR 

Consider the function 

F = F (y, y, t) (Bl) 

where the dot denotes derivative with respect to t, the independent variable. For 
a fixed value of t, F depends only on y and y. 

Suppose now that y is replaced by Y, where 

Y = y + e 17 (t) (B 2 ) 

e = a constant 

The change e 17 (t) in y (t) is called the variation of y (t) and is denoted by 6y; 


6 y = € rj (t) (B3) 

We may further define the variation of y (t) as 

6 y = Y - y (B4) 

which, by virtue of Eq. (B2), becomes 

6 y = € 17 (B5) 

However, from (B3), 

-fY (6 y) = c 1? (B6) 

Comparing (B5) and (B6), we find 

• (It) ‘ ar « (B7) 


In other words, the operators 6 and d/d t are commutative. 
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With y replaced by Y of Eq. (B2), the change in F may be expressed as 
AF = F (y + e 17 , y + € r? , t) - F (y, y, t) 

Taking a Taylor expansion in which only linear terms are retained, this reduces to 


„ dF d F 

AF = — € 77 + — e TJ 
3y Sy 

After substituting Eqs. (B3) and (B5), the resulting expression is defined as 
the (first) variation of F: 


SF aF 

6 F = 6 y + 5y 

ay ay 


(B8) 


For a complete analogy with the definition of a differential, one might perhaps 
anticipate the definition 

_ „ dF . 9F . . 9F _ , 

6 F = — 6 y + — r 6 y + — — 6 t 

ay ay at 


However, t is not varied, so that 6 t = 0. Hence the analogy is indeed complete. 


It is easy to verify from the above definitions that the variation operator obeys 
the same laws as the differentiation operator. Thus 


etc. 


6( Fl F 2 ) = Fl 6F 2 + F 2 6 Fl 



F 2 6F 1- F 1 6F 2 

V 


(B9) 

(BIO) 


The essential distinction between the variation and differential operators is the 
following. The differential of a function is a first-order approximation to the change 
in that function along a particular curve . However, the variation of a function is a 
first-order approximation to the change from curve to curve. 

For further details, we refer the reader to standard texts on the calculus of vari- 
ations. ( 98 > 134 ) 
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APPENDIX C 


INNER PRODUCT IN FUNCTION SPACE 

For any two vectors, a and b, the inner product t is defined by 

n 

a • b s a T b = a b. (Cl) 

1=1 3 J 

while the norm of a vector, a, is defined by 

|| a || = (a . a) 1/2 (C2) 

Two vectors, a and b, are said to be orthogonal if 

a • b = 0 (C3) 

when both a and b are nonzero. 

A set of vectors, |a^| , is termed an orthogonal set if 

& ( r ) t a ( s ) _ q f or r/ 8 (C4) 

If, in addition, the relation 

a ( r ) . a ( r ) _ ! for all r (C5) 

is satisfied, then ja^j is called an orthonormal set of vectors. 

It may be shown that the inner product of two vectors satisfies the Schwartz 
inequality 

II a • b|| * II a|| • ||b|| (C6) 

and that for any two vectors, a and b, the following relations hold. 

|| a || > 0 foralla^O (C7) 

|| a || =0 if, and only if, a = 0 (C8) 

tAlso called scalar or dot product. 
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(C9) 


I! /3 all = /3 II all , 0 = scalar 

Furthermore, the gradients of the scalar 

T T T 

b A a = a A b 

where A is an n x n constant matrix, are given by t 
(b T A a) = A a 

^ T T 

~ (b A a) = A b 

In particular, 

(a T A a) = 2 A a 
da 

The vector 


ll == 

a a 

is called the gradient of the scalar, y . 


d a-. 


9y 
a a„ 


(CIO) 


(Cll) 

(C12) 


(C13) 


(C14) 


tThe superscript T denotes transpose. 


I 
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If the vector, a, is a function of a parameter, t, then 


_d 

dt 



_d 

dt 


(a T a) 


1/2 


- i(a T af 1/2 


E 2 a i »! 


1=1 


which reduces to 


|| a|| = — - (C15) 

INI 

Consider now two functions, f (tj) and g (tj), which are defined for discrete 
values of tj, in an interval, t^, tg, . . . . , tj^. Suppose we define a slightly generalized 
inner product by 

n 

£ f (t.) g (t.) At. 

i = l 


where 


At i = h+l ~ 4 i 


Then, in the limit 


lim S f 

n 00 Ji = 1 

Atj -'O 


) At. 


(f, g) 



f(t) g(t) dt 


(C16) 


The quantity, (f, g), is defined as the inner product of the functions , f (t) and 
g (t), in function space. Loosely speaking, we say that (C16) gives the inner product 
of two vectors in an infinite dimensional Euclidean space. 

The norm of f (t) is defined by 


||f|| = V(f, f) (C17) 
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This satisfies a set of relations completely analogous to (C6) - (CIO); viz., 

II (f, g) II * II fll • II gll 

|| f || >0 for all f ^ 0 

|| f || = 0 if, and only if, f = 0 

II t * gll s II (II + II gll 

|| 0f|| = j3 ||f|| for any scalar 0 

In analogy to (C4) , we call a system of functions [fj] orthogonal if for any 
two functions, f r and f g , 


(f r . g - 


f r (t) f s (t) dt = 0 , r^s 


Furthermore, in analogy with (C5), the system of functions {^3 is orthonormal if 
( f k» g = 1 for all k ( 


For a more comprehensive account of these ideas, the reader is referred to 
standard texts. (^ 5 ^» 162, 163) 
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